This week in SD - all the major developments in a nutshell
This is an automated archive made by the Lemmit Bot.
The original was posted on /r/stablediffusion by /u/OkSpot3819 on 2024-09-15 09:12:34+00:00.
- FLUX Updates: Using torch.compile() gives a reported 53.88% speedup on high-end GPUs. Optimization techniques for running FLUX on low-end GPUs like the GTX 1060 6GB.
- Quantization Comparison: Comprehensive comparison of different quantization levels for FLUX.1, balancing model size, VRAM usage, and output quality.
- Layer Fine-tuning: Technique for fine-tuning specific layers in FLUX for faster training and inference while maintaining quality.
- FLUX Fast Mode: Test of FLUX's --fast mode on an RTX 4090 GPU, comparing speed, quality, and LoRA likeness degradation.
- Remote Photography Service: Workflow for creating highly accurate AI-generated portraits using LoRA training on client photos with FLUX.
- FLUX Text Processing: Overview of how FLUX processes text prompts using both CLIP and T5 models for improved prompt interpretation.
⚓ Links, context, visuals for the section above ⚓
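For a rough feel of the size trade-off behind the quantization comparison item, here is a minimal back-of-the-envelope sketch. It assumes the commonly cited figure of about 12 billion parameters for FLUX.1 and counts weights only; real quantized files carry extra overhead (scales, unquantized layers), and actual VRAM use also includes the text encoders, VAE, and activations. The function name `weight_size_gb` is made up for illustration.

```python
# Rough weight-storage estimate at different precisions (weights only).
# Assumption: ~12e9 parameters, the commonly cited size for FLUX.1.
PARAMS = 12e9


def weight_size_gb(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 10^9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    # Nominal bit widths; real quantization formats add some overhead.
    for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
        print(f"{label:>6}: ~{weight_size_gb(PARAMS, bits):.1f} GB")
```

Halving the bits per weight halves the nominal file size, which is why lower quantization levels are what make the model fit on smaller cards, at some cost in output quality.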
- James Earl Jones' AI Voice Legacy: Jones signed over rights to his Darth Vader voice to Lucasfilm, allowing AI recreation using Respeecher technology.
- PS5 Pro Announcement: New console features AI-driven upscaling technology called PlayStation Spectral Super Resolution (PSSR).
- AI Workflow: Image to 3D Scan: Novel workflow for converting AI-generated 2D face images into detailed 3D scans using multiple techniques.
- ComfyUI 3D Pack: Portable Windows version of ComfyUI with pre-installed 3D Pack for easier setup.
- Playbook Beta: Enables 3D scene data streaming with ComfyUI for real-time manipulation and visualization.
- CogVideoX Progress: Developers add code to improve prompts for upcoming Image-to-Video functionality.
- PuLID for FLUX: Release of PuLID-FLUX-v0.9.0 model for tuning-free ID customization in FLUX.1-dev.
- FLUX.1-dev-Controlnet-Inpainting-Alpha: New inpainting ControlNet checkpoint for the FLUX.1-dev model.
- ComfyUI Layer Style Plugin: Adds Photoshop-like layer and mask compositing functionality to ComfyUI.
- 3D Arena: Community-driven leaderboard for evaluating generative 3D models.
- Zero123++: Open-source 3D generative AI model for multi-view image generation from single images.
- GameGen-O: Tencent's AI model for open-world video game generation.
- HeyGen Avatar 3.0: Update allows dynamic generation of facial expressions, body motion, and voice intonation based on script content.
- FineVideo Dataset: Hugging Face releases dataset for advanced video understanding and analysis.
- Fluxgym Update: Adds automatic sample image generation and custom resolution support for FLUX LoRA training.
- RobustSAM: New model improving on Meta's Segment Anything Model for degraded images.
- Concept Sliders: Technique for precise control in image generation/editing with diffusion models.
- Runway Gen-3 Alpha Video to Video: New control mechanism for precise movement and expressiveness in video generation.
⚓ Links, context, visuals for the section above ⚓
- FLUX LoRA Showcase: Golden Haggadah, Amateur Photography [Flux Dev], Anti-Blur, Filmfotos, JWST Deep Space, Topcraft Watercolor, Dark Fantasy, Soviet Era Mosaic, 80s Fisher Price, Playstation 2