501
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/ZootAllures9111 on 2024-09-20 02:27:39+00:00.

Original Title: FYI if you're using something like JoyCaption to caption images: Kohya does not support actual newline characters between paragraphs, it stops parsing the file after the first one it hits, your caption text needs to be separated only by spaces between words (meaning just one long paragraph)


I noticed this was the case a while ago and figured I'd point it out. You can confirm it by comparing the metadata in a LoRA file against captions that had newlines: for a given image, any text after the first newline simply won't be present in that metadata.
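If your captioner writes multi-paragraph captions, a quick preprocessing pass sidesteps the truncation. Here is a minimal sketch (the dataset folder name is a placeholder, not from the original post) that collapses every whitespace run, including newlines, into single spaces so Kohya reads the whole caption:

```python
from pathlib import Path

caption_dir = Path("dataset")  # placeholder: folder holding image/.txt caption pairs

for txt in caption_dir.glob("*.txt"):
    text = txt.read_text(encoding="utf-8")
    flat = " ".join(text.split())  # collapse newlines and repeated whitespace into single spaces
    txt.write_text(flat, encoding="utf-8")
```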

502
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/flyingdickins on 2024-09-19 22:37:03+00:00.

503
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/an303042 on 2024-09-19 21:05:45+00:00.

504
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/371830 on 2024-09-19 17:53:26+00:00.


After using Flux for over a month now, I'm curious what your combo is for best image quality. I only started local image generation last month (I was an occasional MJ user before), so it's been pretty much constant learning. One of the things that took me a while to realize is that it's not just the choice of model that matters, but also all the other bits like the CLIP, text encoder, sampler, etc., so I thought I'd share this; maybe other newbies will find it useful.

Here is my current best-quality setup (photorealistic). I have 24 GB of VRAM, but I think it will work on 16 GB. (A rough code sketch of the core pieces follows the list.)

  • flux1-dev-Q8_0.gguf

  • clip: ViT-L-14-TEXT-detail-improved-hiT-GmP-TE-only-HF.safetensors - until last week I didn't even know you could use different CLIP models. This one made a big difference for me and works better than ViT-L-14-BEST-smooth. Thanks u/zer0int1

  • te: t5-v1_1-xxl-encoder-Q8_0.gguf - not sure if it makes any difference vs t5xxl_fp8_e4m3fn.safetensors

  • vae: ae.safetensors - don't remember where I got this one from

  • sampling: Forge Flux Realistic - best results of the few sampling methods I tested in Forge

  • scheduler: simple

  • sampling steps: 20

  • DCFG 2-2.5 - with PAG (below) enabled, it seems I can push DCFG higher before skin starts to look unnatural

  • Perturbed Attention Guidance: 3 - this adds about 40% to inference time, but I see a clear improvement in prompt adherence and overall consistency, so I always keep it on. Above 5, images start to look unnatural.

  • Other optional settings in Forge did not give me any convincing improvements, so I don't use them.
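Not from the original post, but for anyone reproducing the gist of this outside Forge, here is a rough sketch in diffusers: load FLUX.1-dev and swap in zer0int's improved CLIP-L as the text encoder. The pipeline class and Hugging Face repo ids are assumptions on my part, and PAG plus the Forge-specific "Flux Realistic" sampler are not shown:

```python
import torch
from transformers import CLIPTextModel
from diffusers import FluxPipeline

# Assumed repo id for zer0int's improved CLIP-L text encoder (GmP fine-tune).
clip = CLIPTextModel.from_pretrained("zer0int/CLIP-GmP-ViT-L-14", torch_dtype=torch.bfloat16)

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    text_encoder=clip,            # swap in the improved CLIP-L
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()   # helps fit into 16-24 GB of VRAM

image = pipe(
    prompt="candid photo of a street musician at dusk",
    num_inference_steps=20,       # matches the 20-step setting above
    guidance_scale=2.5,           # distilled CFG in the 2-2.5 range
).images[0]
image.save("flux_test.png")
```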

505
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/simpleuserhere on 2024-09-19 15:30:21+00:00.

506
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/SmaugPool on 2024-09-19 14:10:52+00:00.

507
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Patient-Librarian-33 on 2024-09-19 12:20:16+00:00.

508
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/tintwotin on 2024-09-19 09:56:48+00:00.


Image-to-video for CogVideoX-5b, implemented in the diffusers library by zRdianjiao and Aryan V S, has now been added to Pallaidium, the free and open-source Blender VSE add-on.

509
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/JBOOGZEE on 2024-09-19 08:56:14+00:00.

510
1
Elektroschutz⚡ LoRA (www.reddit.com)
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/an303042 on 2024-09-19 08:15:52+00:00.

511
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/wonderflex on 2024-09-19 06:27:25+00:00.

512
1
Hszd-workflow-icon (www.reddit.com)
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/zazaoo19 on 2024-09-19 03:36:44+00:00.

513
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Pultti4 on 2024-09-19 02:33:07+00:00.

514
1
landscape. (i.redd.it)
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/EcoPeakPulse on 2024-09-19 02:27:02+00:00.

515
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Junior_Economics7502 on 2024-09-18 19:54:38+00:00.

516
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Angrypenguinpng on 2024-09-18 22:00:39+00:00.

517
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/ScarletEnthusiast on 2024-09-18 17:21:57+00:00.

518
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Old_Reach4779 on 2024-09-18 16:18:06+00:00.


Hugging Face:

Hugging Face Space:

GitHub:

ComfyUI node: (kijai just added an i2v example workflow 😍)

License: Apache-2.0!
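For orientation (not part of the original post), image-to-video with this release is typically driven through the diffusers pipeline roughly as sketched below; the pipeline class, repo id, and parameter values follow the public CogVideoX-5b-I2V release and should be treated as assumptions:

```python
import torch
from diffusers import CogVideoXImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    "THUDM/CogVideoX-5b-I2V",          # assumed repo id for the I2V release
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()         # offload to fit consumer VRAM

image = load_image("first_frame.png")   # placeholder input image
frames = pipe(
    image=image,
    prompt="a slow pan across a misty mountain lake",
    num_frames=49,                      # the model generates 49-frame clips
    num_inference_steps=50,
    guidance_scale=6.0,
).frames[0]
export_to_video(frames, "output.mp4", fps=8)
```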

519
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/jerrydavos on 2024-09-18 16:16:21+00:00.

520
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Old_Reach4779 on 2024-09-18 11:55:02+00:00.


source: https://github.com/THUDM/CogVideo/tree/CogVideoX_dev

edit2:

they released it!

edit:

Hugging Face model just released! link

(the GitHub main branch is still not merged)

Today will be a long loooong loooooooooooooooooooooong day!

521
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/stableee on 2024-09-18 11:51:24+00:00.

522
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/hkunzhe on 2024-09-18 11:39:09+00:00.


Alibaba PAI has been using the EasyAnimate framework to fine-tune CogVideoX and has open-sourced CogVideoX-Fun, which includes both 5B and 2B models. Compared to the original CogVideoX, we added I2V and V2V functionality and support for video generation at any resolution from 256x256x49 to 1024x1024x49.

HF Space:

Code:

ComfyUI node:

Models: &

Discord: 

523
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/an303042 on 2024-09-18 10:25:17+00:00.

524
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Kinda-Brazy on 2024-09-18 09:15:50+00:00.

525
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/oodelay on 2024-09-18 01:56:52+00:00.
