26
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/ectoblob on 2024-10-24 18:13:35+00:00.

27
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/ZootAllures9111 on 2024-10-24 20:46:46+00:00.

28
1
ROYGBIV Flux LoRA (www.reddit.com)
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/renderartist on 2024-10-24 20:28:32+00:00.

29
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/barepixels on 2024-10-24 17:53:35+00:00.

30
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/YentaMagenta on 2024-10-24 17:42:52+00:00.

31
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/jenza1 on 2024-10-24 14:59:04+00:00.

32
1
Animation Shot LoRA ✨ (www.reddit.com)
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/ZyloO_AI on 2024-10-24 11:38:20+00:00.

33
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Betadoggo_ on 2024-10-24 05:21:30+00:00.


I was using the tag list taken from the tag-complete extension, but it was missing several artists and characters that work in newer models. The repo contains both a premade CSV and the interactive script used to create it. The list is validated to work with SwarmUI and should also work with any UI that supports the original list from tag-complete.
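For reference, here's a minimal sketch of how such a merge could be scripted, assuming the usual four-column tag-complete layout (tag, category number, post count, aliases); the file names are placeholders, not the repo's actual files:

import csv

# Hypothetical sketch: extend a tag-complete style CSV with missing
# artist/character tags, keeping the first occurrence of each tag.
def merge_tag_lists(base_csv, extra_csv, out_csv):
    seen, rows = set(), []
    for path in (base_csv, extra_csv):
        with open(path, newline="", encoding="utf-8") as f:
            for row in csv.reader(f):
                if row and row[0] not in seen:
                    seen.add(row[0])
                    rows.append(row)
    with open(out_csv, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)

merge_tag_lists("danbooru.csv", "missing_tags.csv", "merged.csv")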

34
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Cheap-Ambassador-304 on 2024-10-24 11:49:57+00:00.

35
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Total-Resort-3120 on 2024-10-24 11:21:53+00:00.


Intro:

If you haven't seen it yet, there's a new model called Mochi 1 that displays incredible video capabilities, and the good news for us is that it's local and has an Apache 2.0 licence: https://x.com/genmoai/status/1848762405779574990

Our overlord kijai made a ComfyUI node that makes this feat possible in the first place; here's how it works:

  1. The T5-XXL text encoder is loaded (~9 GB VRAM) to encode your prompt, then it unloads.
  2. Mochi 1 is loaded; you can choose between fp8 (up to 361 frames before memory overflow, i.e. ~15 s at 24 fps) or bf16 (up to 61 frames before overflow, i.e. ~2.5 s at 24 fps); then it unloads.
  3. The VAE turns the result into a video. This is the part that would normally need far more than 24 GB of VRAM; fortunately, a technique called VAE tiling runs the decode piece by piece so it won't overflow a 24 GB card (a toy sketch of the idea follows below). You don't need to tinker with the tiling values; he made a workflow for it and it just works.
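Here's the toy sketch mentioned above, illustrating only the tiling idea, not kijai's actual node code (real implementations also overlap and blend tile borders to hide seams):

import torch

# Decode the latent video one spatial tile at a time so peak VRAM is
# bounded by the tile size instead of the full frame.
def decode_tiled(vae_decode, latents, tile=64):
    # latents: (batch, channels, frames, height, width) in latent space
    rows = []
    for y in range(0, latents.shape[-2], tile):
        cols = []
        for x in range(0, latents.shape[-1], tile):
            chunk = latents[..., y:y + tile, x:x + tile]
            cols.append(vae_decode(chunk))  # each call only needs tile-sized VRAM
        rows.append(torch.cat(cols, dim=-1))
    return torch.cat(rows, dim=-2)

# Smoke test with an identity "decoder", purely to show the plumbing:
print(decode_tiled(lambda z: z, torch.randn(1, 4, 61, 128, 96)).shape)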

How to install:

1) Go to the ComfyUI_windows_portable\ComfyUI\custom_nodes folder, open cmd and type this command:

git clone https://github.com/kijai/ComfyUI-MochiWrapper

2) Go to the ComfyUI_windows_portable\update folder, open cmd and type those 2 commands:

..\python_embeded\python.exe -s -m pip install accelerate

..\python_embeded\python.exe -s -m pip install einops

3) You have 3 optimization choices when running this model: sdpa, flash_attn and sage_attn.

sage_attn is the fastest of the three, so it's the only one that matters here.

Go to the ComfyUI_windows_portable\update folder, open cmd and type this command:

..\python_embeded\python.exe -s -m pip install sageattention

4) To use sage_attn you need Triton; on Windows it's quite tricky to install, but it's definitely possible:

  • I highly suggest you have torch 2.5.0 + CUDA 12.4 to keep things running smoothly. If you're not sure you have it, go to the ComfyUI_windows_portable\update folder, open cmd and type this command:

..\python_embeded\python.exe -s -m pip install --upgrade torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
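To confirm the upgrade took, you can run a quick check with the same embedded interpreter (these are standard torch attributes, nothing Mochi-specific):

..\python_embeded\python.exe -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"

You should see something like 2.5.0+cu124 12.4 True.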

  • Once you've done that, go to this link: , download the triton-3.1.0-cp311-cp311-win_amd64.whl binary and put it in the ComfyUI_windows_portable\update folder

  • Go to the ComfyUI_windows_portable\update folder, open cmd and type this command:

..\python_embeded\python.exe -s -m pip install triton-3.1.0-cp311-cp311-win_amd64.whl

5) Triton still won't work if we don't do this:

  • Install Python 3.11.9 on your computer

  • Go to C:\Users\Home\AppData\Local\Programs\Python\Python311 and copy the libs and include folders

  • Paste those folders into ComfyUI_windows_portable\python_embeded

Triton and sage attention should be working now.
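A quick way to sanity-check, again from the embedded interpreter (plain imports, nothing exotic):

..\python_embeded\python.exe -c "import triton, sageattention; print(triton.__version__)"

If that prints 3.1.0 without an ImportError, both packages are in place.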

6) Download the fp8 or the bf16 model

  • Go to ComfyUI_windows_portable\ComfyUI\models and create a folder named "diffusion_models"

  • Go to ComfyUI_windows_portable\ComfyUI\models\diffusion_models, create a folder named "mochi" and put your model in there.

7) Download the VAE

  • Go to ComfyUI_windows_portable\ComfyUI\models\vae, create a folder named "mochi" and put your VAE in there

8) Download the text encoder

  • Go to ComfyUI_windows_portable\ComfyUI\models\clip, and put your text encoder in there.

And there you have it. Now that everything is set up, load this workflow in ComfyUI and you can make your own AI videos. Have fun!

A 22 years old woman dancing in a Hotel Room, she is holding a Pikachu plush

36
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/TemperFugit on 2024-10-24 00:41:07+00:00.

37
1
Made with SD3.5 Large (www.reddit.com)
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/zengccfun on 2024-10-24 00:22:48+00:00.

38
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Treeshark12 on 2024-10-23 22:52:05+00:00.

39
1
Papers without Code (old.reddit.com)
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/_lordsoffallen on 2024-10-23 16:38:09+00:00.


I've been trying to read some research papers in the image generation field, and I noticed that quite a few researchers announce on their GitHub page or in the paper that they will release the code soon, but they NEVER do. Some papers go back almost two years now. At this point I can't really take any of the results seriously, since there's nothing to validate; for all I know it could all be fake. Am I missing something, or what's the rationale behind not releasing it?

40
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/FizzarolliAI on 2024-10-23 15:40:27+00:00.

41
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/barepixels on 2024-10-24 03:38:57+00:00.

42
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/CeFurkan on 2024-10-23 23:55:34+00:00.

43
1
OmniGen is Out! (old.reddit.com)
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/marcoc2 on 2024-10-23 23:47:55+00:00.


Installing it on Pinokio right now and wondering why nobody has talked about it here yet.

44
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/_micah_h on 2024-10-23 22:31:34+00:00.

45
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Ecstatic_Signal_1301 on 2024-10-23 20:00:15+00:00.


Prompt: shot from below, family looking down the camera and smiling, father on the right, mother on the left, boy and girl in the middle, happy family

46
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/lowiqdoctor on 2024-10-23 18:38:26+00:00.

47
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/hackerzcity on 2024-10-23 18:26:57+00:00.


What is Tora? Think of it as a smart video generator. It can take your text, pictures, and instructions (like “make a car drive on a mountain road”) and turn them into actual videos. Tora is powered by something called Diffusion Transformers.

Features of Tora

Tora’s strength comes from three key parts:

  1. Trajectory Extractor (TE): encodes how objects (like birds or balloons) should move in your video.
  2. Spatial-Temporal Diffusion Transformer (ST-DiT): handles all the frames in the video.
  3. Motion-Guidance Fuser (MGF): makes sure the movements stay natural and smooth.

Tora can make videos up to 720p with 204 frames, giving you short and long videos that look great. Older models couldn’t handle long videos as well, but Tora is next-level.

Using trajectory-guided motion, Tora ensures that objects move naturally. Whether it’s a balloon floating or a car driving, Tora makes sure it all follows the rules of real-life movement.
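To make "trajectory-guided" concrete: a trajectory is essentially one target position per frame. Here's a toy sketch of building one (names and shapes are made up for illustration; this is not Tora's actual API):

# One (x, y) position per frame; the Trajectory Extractor turns paths
# like this into motion features for the diffusion transformer.
def line_trajectory(start, end, num_frames=204):
    (x0, y0), (x1, y1) = start, end
    return [(x0 + (x1 - x0) * i / (num_frames - 1),
             y0 + (y1 - y0) * i / (num_frames - 1))
            for i in range(num_frames)]

# A car driving left-to-right across a 1280x720 frame:
path = line_trajectory((0.0, 500.0), (1280.0, 500.0))
print(path[0], path[-1])  # (0.0, 500.0) (1280.0, 500.0)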

Resources:


Update this Node: https://github.com/kijai/ComfyUI-CogVideoXWrapper

Tutorials: https://www.youtube.com/watch?v=vUDqk72osfc

Workflow: https://comfyuiblog.com/comfyui-tora-text-to-video-workflow/

48
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/StarShipSailer on 2024-10-23 14:52:39+00:00.

49
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/StableLlama on 2024-10-23 10:14:37+00:00.


The guy who brought us the great SD1.5 Realistic Vision and the SDXL RealVisXL (the best SDXL finetune according to imgsys.org) created RealFlux, a finetune of Flux, and has now released Verus Vision, which was finetuned on de-distilled Flux:

My first quick test is quite promising!

Test-Prompt, seed fixed: Full body photo of a young woman with long straight black hair, blue eyes and freckles wearing a corset, tight jeans and boots standing in the garden

Verus Vision 1.0b

Flux.1 [dev]

RealFlux 1.0b (transformer_dev)

Update 1: the Flux.1 [dev] image had the wrong workflow (it was generated with the Verus Vision workflow) and thus didn't show the quality of the Flux base correctly. So I recreated it (also with the same seed) and exchanged it here.

Update 2: the RealFlux 1.0b (transformer_dev) model also had a faulty workflow, so it has been regenerated too, and it looks much, much better than the faulty one. But I'm not sure whether it's better than default Flux, as the person is a bit unsharp and still looks like it was copy&pasted onto the background.

50
1
SDNext Release 2024-10 (old.reddit.com)
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/vmandic on 2024-10-23 14:49:37+00:00.


SD.Next Release 2024-10

A month later and with nearly 300 commits, here is the latest SD.Next update!

Workflow highlights

  • Reprocess: New workflow options that allow you to generate at lower quality and then reprocess selected images at higher quality, or generate without hires/refine and then reprocess with hires/refine; you can pick any previous latent from the auto-captured history!
  • Detailer: Fully built-in detailer workflow with support for all standard models
  • Built-in model analyzer: See all details of your currently loaded model, including components, parameter count, layer count, etc.
  • Extract LoRA: Load any LoRA(s), play with generate as usual, and once you like the results simply extract a combined LoRA for future use!

New models

New integrations

  • Fine-tuned CLiP-ViT-L 1st-stage text encoders, used by most models (SD15/SDXL/SD3/Flux/etc.), bring additional details to your images
  • Ctrl+X which allows for control of structure and appearance without the need for extra models
  • APG: Adaptive Projected Guidance for optimal guidance control
  • LinFusion for on-the-fly distillation of any sd15/sdxl model

What else?

  • Tons of work on dynamic quantization that can be applied on-the-fly during model load to any model type. Supported quantization engines include BitsAndBytes, TorchAO, Optimum.quanto, NNCF and GGUF (a rough sketch of the idea follows this list)
  • Auto-detection of the best available device/dtype settings for your platform and GPU reduces the need for manual configuration
  • Full rewrite of sampler options, now far more streamlined, with tons of new options to tweak scheduler behavior
  • Improved LoRA detection and handling for all supported models
  • Several Flux.1 optimizations and new quantization types
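For a feel of what on-the-fly weight quantization means in practice, here's a generic TorchAO illustration (TorchAO is one of the engines listed above; this is not SD.Next's internal wiring):

import torch
from torchao.quantization import quantize_, int8_weight_only

# Quantize a model's Linear weights to int8 in place, roughly halving
# weight memory versus fp16 (4x versus fp32) at load time.
model = torch.nn.Sequential(
    torch.nn.Linear(4096, 4096),
    torch.nn.ReLU(),
    torch.nn.Linear(4096, 4096),
)
quantize_(model, int8_weight_only())
print(model)  # the Linear layers now carry int8 weights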

Oh, and we've compiled a full table listing the top 30 most popular text-to-image generative models (how many have you tried?), their respective parameters and an architecture overview: Models Overview

And there are also other goodies like multiple XYZ grid improvements, additional Flux ControlNets, additional Interrogate models, better LoRA tags support, and more...

Detailer interface

Sampler options

CLiP replacement

README | CHANGELOG | WiKi | Discord
