51
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/hackerzcity on 2024-10-23 18:26:57+00:00.


What is Tora? Think of it as a smart video generator. It can take your text, pictures, and instructions (like “make a car drive on a mountain road”) and turn them into actual videos. Tora is powered by something called Diffusion Transformers.

Features of Tora

Tora’s strength comes from three key parts (a rough sketch of how they fit together follows this list):

  1. Trajectory Extractor (TE): encodes how objects (like birds or balloons) should move in your video,
  2. Spatial-Temporal Diffusion Transformer (ST-DiT): handles all the frames in the video,
  3. Motion-Guidance Fuser (MGF): makes sure that the movements stay natural and smooth.
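
As a rough illustration only (not Tora's actual code; the module names, shapes, and wiring below are assumptions based purely on the descriptions above), here is a minimal sketch of how a trajectory could be encoded and fused into a spatial-temporal transformer pass:

```python
# Conceptual sketch -- NOT Tora's implementation. Names and shapes are illustrative.
import torch
import torch.nn as nn

class TrajectoryExtractor(nn.Module):
    """Encodes a user-drawn trajectory (x, y per frame) into motion features."""
    def __init__(self, dim=64):
        super().__init__()
        self.proj = nn.Linear(2, dim)

    def forward(self, traj):                 # traj: [frames, 2]
        return self.proj(traj)               # -> [frames, dim]

class MotionGuidanceFuser(nn.Module):
    """Injects motion features into the video latents."""
    def __init__(self, dim=64):
        super().__init__()
        self.gate = nn.Linear(dim, dim)

    def forward(self, video, motion):        # video: [frames, tokens, dim]
        return video + self.gate(motion).unsqueeze(1)

class STDiTBlock(nn.Module):
    """Stand-in for one spatial-temporal transformer block."""
    def __init__(self, dim=64):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, x):
        out, _ = self.attn(x, x, x)
        return x + out

frames, tokens, dim = 16, 256, 64
te, mgf, block = TrajectoryExtractor(dim), MotionGuidanceFuser(dim), STDiTBlock(dim)

trajectory = torch.rand(frames, 2)           # e.g. a car's path across the screen
latent_video = torch.randn(frames, tokens, dim)

motion = te(trajectory)                      # TE: turn the path into motion features
fused = mgf(latent_video, motion)            # MGF: inject motion into the latents
denoised = block(fused)                      # ST-DiT: one transformer pass
print(denoised.shape)                        # torch.Size([16, 256, 64])
```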

Tora can make videos up to 720p with 204 frames, giving you short and long videos that look great. Older models couldn’t handle long videos as well, but Tora is next-level.

Using trajectory-guided motion, Tora ensures that objects move naturally. Whether it’s a balloon floating or a car driving, Tora makes sure it all follows the rules of real-life movement.

Resources:

Update this Node: https://github.com/kijai/ComfyUI-CogVideoXWrapper

Tutorials: https://www.youtube.com/watch?v=vUDqk72osfc

Workflow: https://comfyuiblog.com/comfyui-tora-text-to-video-workflow/

52
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/StarShipSailer on 2024-10-23 14:52:39+00:00.

53
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/StableLlama on 2024-10-23 10:14:37+00:00.


The guy who brought us the great SD1.5 Realistic Vision and the SDXL RealVisXL (the best SDXL finetune according to imgsys.org) had already created a Flux finetune, RealFlux, and has now released Verus Vision, which was finetuned on de-distilled Flux:

My first quick test is quite promising!

Test-Prompt, seed fixed: Full body photo of a young woman with long straight black hair, blue eyes and freckles wearing a corset, tight jeans and boots standing in the garden
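
For anyone who wants to reproduce this kind of side-by-side test, here is a minimal sketch using diffusers with a fixed seed (the base Flux.1 [dev] repo id is real, but the seed and step count are placeholders rather than the ones used for the images below; for the finetunes you would point at their respective checkpoints instead):

```python
# Hedged sketch: fixed-seed generation for apples-to-apples model comparison.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",          # swap in the finetune you want to test
    torch_dtype=torch.bfloat16,
).to("cuda")

prompt = ("Full body photo of a young woman with long straight black hair, "
          "blue eyes and freckles wearing a corset, tight jeans and boots "
          "standing in the garden")

# Keeping the generator seed fixed is what makes the comparison fair across models.
generator = torch.Generator("cuda").manual_seed(42)   # placeholder seed
image = pipe(prompt, generator=generator, num_inference_steps=28).images[0]
image.save("comparison.png")
```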

Verus Vision 1.0b

Flux.1 [dev]

RealFlux 1.0b (transformer_dev)

Update 1: the Flux.1 [dev] image had the wrong workflow (it was generated with the Verus Vision workflow) and thus didn't show the quality of the Flux base correctly. So I recreated it (also with the same seed) and replaced it here.

Update 2: The RealFlux 1.0b (transformer_dev) model also had a faulty workflow, so it has also been regenerated now, and it's looking much, much better than the faulty one. But I'm not sure whether it's better than default Flux, as the person is a bit unsharp and still looks like it was copy&pasted onto the background.

54
1
SDNext Release 2024-10 (old.reddit.com)
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/vmandic on 2024-10-23 14:49:37+00:00.


SD.Next Release 2024-10

A month later and with nearly 300 commits, here is the latest SD.Next update!

Workflow highlights

  • Reprocess: New workflow options that let you generate at lower quality and then reprocess only selected images at higher quality, or generate without hires/refine and then reprocess with hires/refine, and you can pick any previous latent from the auto-captured history!
  • Detailer: Fully built-in detailer workflow with support for all standard models
  • Built-in model analyzer: See all details of your currently loaded model, including components, parameter count, layer count, etc.
  • Extract LoRA: Load any LoRA(s) and generate as usual; once you like the results, simply extract the combined LoRA for future use!

New models

New integrations

  • Fine-tuned CLiP-ViT-L 1st-stage text encoders, used by most models (SD15/SDXL/SD3/Flux/etc.), bring additional details to your images
  • Ctrl+X which allows for control of structure and appearance without the need for extra models
  • APG: Adaptive Projected Guidance for optimal guidance control
  • LinFusion for on-the-fly distillation of any sd15/sdxl model

What else?

  • Tons of work on dynamic quantization that can be applied on-the-fly during model load to any model type. Supported quantization engines include BitsAndBytes, TorchAO, Optimum.quanto, NNCF, and GGUF (see the sketch after this list)
  • Auto-detection of the best available device/dtype settings for your platform and GPU reduces the need for manual configuration
  • Full rewrite of sampler options, now far more streamlined, with tons of new options to tweak scheduler behavior
  • Improved LoRA detection and handling for all supported models
  • Several Flux.1 optimizations and new quantization types
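
As a rough illustration of what on-the-fly quantization at load time means in general, here is a generic BitsAndBytes example via transformers (this is NOT SD.Next's internal code; the checkpoint and settings are assumptions, and it needs a CUDA GPU with bitsandbytes and accelerate installed):

```python
# Generic example of 8-bit quantization applied while loading a model,
# shown here on an SDXL text encoder -- not SD.Next's actual implementation.
from transformers import BitsAndBytesConfig, CLIPTextModel

quant_cfg = BitsAndBytesConfig(load_in_8bit=True)

text_encoder = CLIPTextModel.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    subfolder="text_encoder",
    quantization_config=quant_cfg,   # quantized on the fly, no pre-converted checkpoint
    device_map="auto",
)
print(text_encoder.get_memory_footprint())  # noticeably smaller than fp16
```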

Oh, and we've compiled a full table listing the top 30 most popular text-to-image generative models (how many have you tried?), with their respective parameters and an architecture overview: Models Overview

And there are also other goodies like multiple XYZ grid improvements, additional Flux ControlNets, additional Interrogate models, better LoRA tags support, and more...

Detailer interface

Sampler options

CLiP replacement

README | CHANGELOG | WiKi | Discord

55
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Dazzyreil on 2024-10-23 12:21:00+00:00.

56
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/koalapon on 2024-10-23 09:21:48+00:00.

57
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Haghiri75 on 2024-10-23 08:36:19+00:00.


Greetings guys.

I'm glad to announce that the new family of Mann-E models is on the way. Alongside the "Dreams" project, we've been working hard on a "Flux-compatible" architecture to deliver a model with the quality and coherency of Flux dev or Midjourney.

I believe I announced it before, but we've made slight changes to the business/release model for the new product!

Well, we made these changes:

  1. One open-weight model is our way to go.
  2. We came up with the Mann-E Public Source License, which lets you use it as much as you want in your personal setup (and mark my words, sell the output images as much as you like and I don't care; what we care about the most is the model itself), but if you want to use the model commercially, you have to be in direct affiliation with us.

And also we have these policies:

Quality/Anime models (which are hosted by us and will be available on an online platform soon) are capable of "accidental NSFW", but like pretty much every corporation, we're going to have it under some filters.

But good news: the open-weight (Mann-E Flux) model is completely uncensored, since we intended it to be a locally run model (or, if hosted, we guarantee you will have 100% freedom 😁)

If you're interested in more information or discussion about the model, check out our GitHub page:

58
1
[FLUX] Thumbnail Design (www.reddit.com)
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/BigRub7079 on 2024-10-23 06:38:19+00:00.

59
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Successful_AI on 2024-10-23 05:28:13+00:00.

60
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/centuryglass on 2024-10-23 03:39:39+00:00.


I've been working on upgrading my GLID-3-XL-based inpainting software ever since Stable-Diffusion first came out, and it's gradually evolved into a full-featured image editor in its own right. I've been using it for quite a while, but I've only just reached the level of stability, documentation, and polish that would justify a public release.

As a demonstration, I've uploaded a short narrated time-lapse video of my artistic process on YouTube here: . Download links, instructions, and tutorials are on the GitHub page.

AI features:

  • Easy and fully configurable inpainting, text-to-image, and image-to-image within a movable image area, making it as simple as possible to create arbitrarily large images.
  • Integrated ControlNet panel, LORA selection, and access to prompt styles you've saved in the WebUI
  • AI upscaling support, including support for ControlNet tiled upscaling combined with the Ultimate SD upscaling script.
  • All AI features are powered by the API mode of Automatic1111 or Forge (a sketch of what those API calls look like follows this list). If you've already installed one of those two, you won't need to deal with any more tedious Python dependency management.
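
For context, here is a minimal, generic example of the kind of call the Automatic1111/Forge API mode exposes (this is not this project's actual client code, and it assumes the WebUI is running locally with the --api flag):

```python
# Minimal txt2img request against a locally running Automatic1111/Forge instance.
import base64
import requests

payload = {
    "prompt": "a lighthouse on a cliff at sunset",
    "steps": 20,
    "width": 512,
    "height": 512,
}
resp = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload, timeout=300)
resp.raise_for_status()

# The API returns generated images as base64-encoded PNGs.
with open("result.png", "wb") as f:
    f.write(base64.b64decode(resp.json()["images"][0]))
```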

Digital art and image editing features:

  • A full layer stack implementation, with support for transformations, groups, compositing and blending modes, and more.
  • An advanced and versatile brush engine with drawing tablet support, thanks to libmypaint.
  • All the usual tools you'd expect from an image editor: Text, shape creation, smudge, blur, filters, etc., all extensively documented.

61
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/lunarstudio on 2024-10-22 22:16:25+00:00.


Just received this via email and I’m taking a hard pass, not because I’m doing anything wrong but simply because it’s too easy to be wrongly flagged:

We’re writing to let you know about two recent updates to Google Cloud Platform Terms of Service which will be effective starting November 15, 2024.

We’ve provided additional information below to help you understand the specific changes implemented.

What you need to know

We added new terms and updated the following Google Cloud Platform Terms of Service:

We added terms for logging customer prompts due to potential abuse of Generative AI Services. Effective November 15, 2024, if our automated safety tools detect potential abuse of Google’s policies, we may log your prompts to review and determine if a violation has occurred. You can review the updated section to the Google Cloud Terms of Service, below. Section 4. Suspension:

“4.3 Generative AI Safety and Abuse. Google uses automated safety tools to detect abuse of Generative AI Services. Notwithstanding the “Handling of Prompts and Generated Output” section in the Service Specific Terms, if these tools detect potential abuse or violations of Google’s AUP or Prohibited Use Policy, Google may log Customer prompts solely for the purpose of reviewing and determining whether a violation has occurred. See the Abuse Monitoring page for more information about how logging prompts impacts your use of the Services.”

SecOps services are now covered by the Google Cloud Terms of Service. You do not need to agree to any additional terms of service when you purchase SecOps services. We have updated various provisions to align the Google Cloud Terms of Service with the previous SecOps terms. We added Section 14 (Resold Customers), which applies only if you purchase SecOps services through a reseller. In addition, we have made the following updates:

  • Section 1.4 Updates. We clarified how we notify you about updates to the Services and legal terms. We also removed outdated practices.
  • Section 2.1 Billing. We clarified when you need to pay if you use a credit card or receive an invoice.
  • Section 8.2 Termination for Breach. We added a section for terminating individual order forms if you breach the terms.
  • Section 9 Publicity. We clarified that neither Google nor you can issue a press release or similar public statement about your use of the services without the other party's permission.

Note: If you sign up for SecOps after October 16, 2024, you will automatically agree to the new terms.

What you need to do

No action is required from you. This update will be effective starting November 15, 2024.

To learn more about the specific changes, please review the updated terms. You can find the previous versions at the bottom of the relevant term pages.

We’re here to help

If you have any questions related to these changes, feel free to contact us at Google Cloud Support.

Thanks for choosing Google Cloud.

62
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/BreakingGood on 2024-10-22 17:50:45+00:00.

Original Title: "Stability just needs to release a model almost as good as Flux, but undistilled with a better license" Well they did it. It has issues with limbs and fingers, but it's overall at least 80% as good as Flux, with a great license, and completely undistilled. Do you think it's enough?


I've heard many times on this sub how Stability just needs to release a model that is:

  • Almost as good as Flux
  • Undistilled, fine-tunable
  • With a good license

And they can make a big splash and take the crown again.

The model clearly has issues with limbs and fingers, but in theory the ability to fine-tune it can address them. Do you think they managed it with 3.5?

63
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Puzll on 2024-10-22 17:35:40+00:00.

64
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/3deal on 2024-10-22 17:05:51+00:00.

65
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Neat_Ad_9963 on 2024-10-22 16:41:52+00:00.


Sure, Flux dev might beat SD 3.5 Large in some areas, but I think SD 3.5 Large has something special going for it - it's a base model, not a distilled one. This actually matters because base models are generally easier to work with when you're training them, unlike distilled models like Flux dev. Plus, SD 3.5 Large comes with a more flexible license, which is always a nice bonus.

66
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/BusinessFondant2379 on 2024-10-22 16:40:52+00:00.

67
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Hot_Opposite_1442 on 2024-10-22 15:32:56+00:00.

68
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/EldrichArchive on 2024-10-22 14:19:12+00:00.

69
1
Sd 3.5 Large released (old.reddit.com)
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/CesarBR_ on 2024-10-22 14:00:22+00:00.


I'll just drop it here.

70
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/lil_jimmy_norton on 2024-10-22 12:20:08+00:00.

71
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Amazing_Painter_7692 on 2024-10-22 03:57:36+00:00.

72
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Familiar-Art-6233 on 2024-10-22 00:40:17+00:00.

73
1
Bread Car (i.redd.it)
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/KacperXX on 2024-10-21 19:42:33+00:00.

74
1
[NVIDIA] Sana demo is now available (ea13ab4f5bd9c74f93.gradio.live)
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/wiserdking on 2024-10-21 22:21:02+00:00.

75
1
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/jenza1 on 2024-10-21 13:05:37+00:00.
