[-] Cavendish@lemmynsfw.com 5 points 11 months ago* (last edited 11 months ago)

No ControlNet or inpainting. Everything was generated in one go with a single prompt. I'll sometimes use regional prompts to set zones for the head and torso (usually the top 40% is where the head goes, the bottom 60% is for torso/outfit). But even when I have regional prompting turned off, it will still generate a 3/4 (cowboy) shot.

I assume you pulled the prompt out of one of my images? If not, you can feed them into pngchunk.com. Here's the general format I use with regional prompting:

*scene setting stuff*
ADDCOMM
*head / hair description*
ADDROW
*torso/body/pose*

The LoRAs in the top (common) section are weighted pretty low, 0.2 - 0.3, because they get repeated/multiplied in each of the two regional rows. So I think at the end they're effectively around 0.6 - 0.8.
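For what it's worth, the arithmetic behind that 0.6 - 0.8 estimate can be sketched like this. It assumes the common-section weight simply stacks additively once for the common block plus once per regional row; `effective_weight` is just my illustration, not anything from the Regional Prompter extension itself:

```python
# Assumption: a LoRA in the common (ADDCOMM) section applies once in the
# common block and is then repeated in each regional row, stacking additively.
def effective_weight(common_weight: float, num_rows: int) -> float:
    return common_weight * (1 + num_rows)

# With two rows (head and torso):
low = effective_weight(0.2, 2)   # 0.6
high = effective_weight(0.3, 2)  # 0.9, roughly the 0.8 ballpark
```

If the stacking is really multiplicative inside the extension the numbers would differ, but this matches what I see in practice.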

prompt example


photo of a young 21yo (Barbadian Barbados dark skin:1.2) woman confident pose, arms folded behind back, poised and assured outside (place cav_rdrguarma:1.1),
(Photograph with film grain, 8K, RAW DSLR photo, f1.2, shallow depth of field, 85mm lens),
masterwork, best quality, soft shadow
(soft light, color grading:0.4)

ADDCOMM

sunset beach with ocean and mountains and cliff ruin in the background,
(amethyst with violet undertones hair color in a curly layers style:1.2),
perfect eyes, perfect skin, detailed skin

ADDROW

choker,
(pea green whimsical unicorn print bikini set:1.1) (topless:1.3) cameltoe (undressing, panty pull:1.4)
(flat breast, normal_nipples:1.4),
(tan lines, beauty marks:0.6)
(SkinHairDetail:0.8)

It may be that you're not describing the clothing/body enough? My outfit prompts are pretty detailed, and I think that goes a long way toward helping Stable Diffusion decide how to frame things.
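Side note on pulling prompts out of my PNGs: besides pngchunk.com, you can read the embedded text with a few lines of stdlib Python. This is a sketch assuming A1111-style metadata, where the prompt lives in a tEXt chunk keyed "parameters" (the file path in the usage comment is hypothetical):

```python
import struct
import zlib

def png_text_chunks(data: bytes) -> dict:
    """Extract tEXt/zTXt chunks from a PNG byte string.

    A1111-style tools store the generation prompt under the
    'parameters' keyword, which is what pngchunk.com displays.
    """
    assert data[:8] == b"\x89PNG\r\n\x1a\n", "not a PNG"
    chunks, pos = {}, 8
    while pos < len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        ctype = data[pos + 4:pos + 8]
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            # Layout: keyword, NUL separator, latin-1 text.
            key, _, val = body.partition(b"\x00")
            chunks[key.decode("latin-1")] = val.decode("latin-1")
        elif ctype == b"zTXt":
            # Layout: keyword, NUL, 1-byte compression method, zlib data.
            key, _, rest = body.partition(b"\x00")
            chunks[key.decode("latin-1")] = zlib.decompress(rest[1:]).decode("latin-1")
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
        if ctype == b"IEND":
            break
    return chunks

# Usage (hypothetical path):
# with open("image.png", "rb") as f:
#     print(png_text_chunks(f.read()).get("parameters"))
```

This is also why direct lemmy uploads lose the prompt: anything that re-encodes the PNG drops those chunks.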

[-] Cavendish@lemmynsfw.com 6 points 11 months ago

I've tried using fewer than 75 tokens (literally just "woman on beach wearing dress"), and the results weren't much different stability-wise from my 300+ token monstrosity prompts that let me obsess over fabric patterns, hair length, and everything else. So I'm not sure why my experience differs so much from the conventional advice. I think the majority of the jumping comes from the dynamic prompts. Here's one that didn't change the prompt per frame (warning: hands!), and it's much more stable: https://files.catbox.moe/rgjbem.mp4. There are definitely a million knobs to fiddle with in these settings, and it's all changing every day anyway, so it's hard to keep up!

[-] Cavendish@lemmynsfw.com 5 points 1 year ago

That's just the nature of Stable Diffusion. I didn't prompt anything about eye color, so the models fall back on their internal biases: on average, blonde hair = blue eyes, and brown hair = brown eyes.

[-] Cavendish@lemmynsfw.com 6 points 1 year ago

You can get a lot of interesting pose variety by messing with the aspect ratio. See also regional prompting to carve out spaces within the larger frame. I find putting head/hair/face prompts in their own region, then scaling that region, to be extremely effective in controlling close-up to wide shot framing.

[-] Cavendish@lemmynsfw.com 5 points 1 year ago

Post it again, and screw the downvotes. I thought that image was pretty good and am kicking myself now for not saying so before you took it down. Please don't lurk, this community needs more diversity.

[-] Cavendish@lemmynsfw.com 5 points 1 year ago

There's not much out there on training LoRAs that aren't anime characters, and that just isn't my thing. I don't know a chibi from a booru, and most of those tutorials sound like gibberish to me. So I'm kind of just pushing buttons and seeing what happens over lots of iterations.

For this, I settled on the class of "place". I tried "location" but it gave me strange results, like lots of pictures of maps and GPS-type screens. I didn't use any regularization images; like you mentioned, I couldn't think of what to use. I think regularization would be more useful for face training anyway.

I read that a batch size of one gave more detailed results, so I set it there and never changed it. I also didn't use any repeats since I had 161 images.

I did carefully tag each photo with a caption .txt file using Utilities > BLIP Captioning in Kohya_ss. That improved results over the versions I made with no tags. Results improved dramatically again when I went back and manually cleaned up the captions to be more consistent, for instance consolidating "building", "structure", "barn", "church", and "house" all down to just "cabin".
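That cleanup pass is easy to script rather than edit 161 files by hand. A minimal sketch; the folder path and the synonym map are hypothetical, the point is just collapsing BLIP's scattered vocabulary onto one consistent word per concept:

```python
import re
from pathlib import Path

# Hypothetical synonym map: every variant BLIP emitted maps to one canonical word.
SYNONYMS = {
    "building": "cabin",
    "structure": "cabin",
    "barn": "cabin",
    "church": "cabin",
    "house": "cabin",
}

def clean_caption(text: str, synonyms: dict) -> str:
    # Whole-word replacement only, so e.g. "houseboat" is left alone.
    for old, new in synonyms.items():
        text = re.sub(rf"\b{re.escape(old)}\b", new, text)
    return text

def clean_folder(folder: str) -> None:
    # Rewrite every caption .txt sitting next to its training image.
    for txt in Path(folder).glob("*.txt"):
        txt.write_text(clean_caption(txt.read_text(), SYNONYMS))
```

Run `clean_folder` on a copy of the dataset folder first so you can diff before committing to a retrain.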

Epochs was 150, which gave me 24,150 steps. Is that high or low? I have no idea. They say around 2,000 steps for a face, and a full location is way more complex than a single face... It seems to work, but it took me eight different versions to get a model I was happy with.
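The step count checks out with the usual Kohya-style formula, sketched here as a sanity check (the helper name is mine, not from the tool):

```python
# Kohya-style step math: total steps = images * repeats * epochs / batch size.
def total_steps(images: int, repeats: int, epochs: int, batch_size: int) -> int:
    return images * repeats * epochs // batch_size

# 161 images, 1 repeat, 150 epochs, batch size 1:
steps = total_steps(161, 1, 150, 1)  # 24,150, matching the run above
```

Handy for seeing how bumping repeats or batch size would move the total before committing to a long run.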

Let me know what ends up working for you. I'd love to have more discussions about this stuff. As a reward for reading this far, here's a sneak peek at my next lora based on RDR2's Guarma island. https://files.catbox.moe/w1jdya.png. Still a work in progress.

[-] Cavendish@lemmynsfw.com 5 points 1 year ago

AckbarItsATrap.gif

[-] Cavendish@lemmynsfw.com 5 points 1 year ago

She's cute! I like the pose variety.

[-] Cavendish@lemmynsfw.com 4 points 1 year ago* (last edited 1 year ago)

Feels like cheating. I have high hopes for the newer 1024x1024 SDXL models

[-] Cavendish@lemmynsfw.com 4 points 1 year ago

Thanks! I haven't quite gotten the hang of inpainting yet; it's on the list for nipple adjustments. I do try to bury the hands in sand/hair/clothes, though, since those are always nasty to deal with.

[-] Cavendish@lemmynsfw.com 4 points 1 year ago* (last edited 1 year ago)

Are you using the Memmy app? Memmy seems to have a hard time with linked image thumbnails when they're posted with markdown reference links. I do it this way to link to the original PNGs. If I upload to lemmy directly, it strips the prompt metadata.

For what it's worth, the page renders correctly in the browser, and in my preferred iOS app, Bean.

[-] Cavendish@lemmynsfw.com 4 points 1 year ago

From what I've seen, the same seed will pick the same wildcard value from the dynamic prompt text files; there may be a setting to change that behavior. The result would be the same as editing the prompt manually and re-generating with the same seed, though.
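A toy model of that behavior: if the wildcard pick is drawn from an RNG seeded with the generation seed, the pick is fully deterministic. This is an assumption based on what I've observed, not something taken from the extension's source, and the wildcard list is made up:

```python
import random

# Hypothetical wildcard file contents.
WILDCARD_LINES = ["red dress", "blue bikini", "green sundress"]

def pick_wildcard(seed: int, options: list) -> str:
    # A fresh RNG seeded with the generation seed always draws the same line.
    return random.Random(seed).choice(options)
```

Under this model, re-rolling the wildcard means changing the seed; re-using the seed just replays the same substitution.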
