this post was submitted on 30 Jun 2024
I would counter that there are many good use cases that go beyond the scope of what was mentioned in the video (his concerns are absolutely legitimate).
For example:
Nvidia's DLSS for gamers. It gives a decent FPS boost while keeping good picture quality. It uses multiple models: motion prediction, frame interpolation (working out what the image between two rendered frames should look like), and upscaling. These models are (most likely) trained on the video games themselves, which is why you want the latest driver updates: they include updates to those models. And yes, the upscaling and interpolation models are generative models: they fill in whole frames with new pictures containing details that aren't in the source, then enlarge the picture and fill in details in a way that traditional upscaling cannot.
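To make the "traditional means" comparison concrete, here's a rough toy sketch of what non-generative interpolation and upscaling look like. This is my own illustration, not anything from Nvidia: the function names `blend_frames` and `upscale_nearest` are made up, and frames are just tiny grayscale grids of numbers.

```python
def blend_frames(frame_a, frame_b, t=0.5):
    """Naive interpolation: a per-pixel linear blend of two equally sized frames."""
    return [
        [(1 - t) * a + t * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


def upscale_nearest(frame, factor):
    """Nearest-neighbour upscaling: repeat every pixel factor x factor times."""
    out = []
    for row in frame:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


# Two hypothetical 2x2 grayscale frames (values are brightness levels).
frame_a = [[0, 0], [100, 100]]
frame_b = [[100, 0], [100, 0]]

middle = blend_frames(frame_a, frame_b)  # the "in-between" frame
big = upscale_nearest(middle, 2)         # 2x2 -> 4x4
```

Neither function adds any detail: the blend only averages pixels that already exist, and the upscale only copies them. That's the gap the generative models are filling.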
Brainstorming/writer's block:
Generative text models have to be used carefully, I think, and treated like interns with knowledge of a very broad range of subjects. They're great for brainstorming ideas and for writer's block, but their output needs to be verified for accuracy and shouldn't be trusted or used directly in most cases.
Entertainment:
They're also excellent for entertainment purposes, for example, check out this GLaDOS project:
https://old.reddit.com/r/LocalLLaMA/comments/1csnexs/local_glados_now_running_on_windows_11_rtx_2060/
It combines a generative text model (LLM) with a generative audio model (text-to-speech), plus a few other models.
Green screen tools:
We could use the sodium vapor process to create training material for a model that can quickly and accurately handle processing green screens for video production:
https://www.youtube.com/watch?v=UQuIVsNzqDk
Creating avatars for user accounts on websites.
Creating interesting QR codes that actually work:
https://civitai.com/models/111006/qr-code-monster
So, in the end, I think there are some incredible uses for generative AI that go beyond just "creating garbage fast" and that don't cause the problems this video describes (and the problems he describes are definitely valid).
Sure, I'll give you that stuff like that is pretty cool. Pretty niche, though.
Don't like it for this purely because LLMs are pretty much cultural poison.
If you are entertained by this stuff for longer than 15 minutes, you can try jangling keys in front of your face as a cheaper alternative.
Yeah, if you like to be represented by an ugly, plagiarised blob.
No opinion on how viable or useful this may be as I don't really use green screens.
So basically, the pros of generative models are stuff that makes you go "cool, I guess..." for 15 minutes, and the cons are creativity and cool shit on the internet being drowned out by an oversaturation of idiots trying to make a quick buck.