World News
A community for discussing events around the World
Where does the training data come from to create indecent images of children?
It doesn't need CSAM data for training; it just needs to know what a boob looks like and what a child looks like. I run some SDXL-based models at home, and I've observed that this can be difficult to avoid more often than you'd think. There are keywords in porn that blur the lines across datasets ("teen", "petite", "young", "small", etc.). In particular, I've found that adding the word "girl" to basically any porn prompt gives you a small chance of inadvertently creating the undesirable. You have to be really careful and use words like "woman", "adult", etc. instead to convince your image model not to make things that look like children. If you've ever wondered why internet-based porn generators are on super heavy guardrails, this is why.
It is true, a 10 year old naked woman is just a 30 year old naked woman scaled down by 40%. /s
No, buddy, there isn't some vector of "this is the distance between kid and adult" that a model can apply to generate what a hypothetical child looks like. The base model was almost certainly trained on more than just anatomical drawings from Wikipedia - it ate some CSAM.
If you've seen stuff about "Hitler - Germany + Italy = Mussolini": for models where that's true (which is not universal), it takes an awful lot of training data to establish and strengthen those vectors. Unless the generated images were comically inaccurate, a lot of training went into this too.
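For readers who haven't seen the "Hitler - Germany + Italy" arithmetic before, here is a minimal sketch of what it means mechanically. The three-dimensional vectors and their axis meanings are made up by hand for illustration (they come from no real model), so the result holds by construction; a real model has to learn such directions from a large amount of data, which is the point made above.

```python
import math

# Hand-made toy embeddings (hypothetical values, not taken from any real model).
# Assumed axes: [is_leader, german_ness, italian_ness]
EMBEDDINGS = {
    "Hitler":    [1.0, 1.0, 0.0],
    "Mussolini": [1.0, 0.0, 1.0],
    "Germany":   [0.0, 1.0, 0.0],
    "Italy":     [0.0, 0.0, 1.0],
    "Berlin":    [0.2, 0.9, 0.0],
    "Rome":      [0.2, 0.0, 0.9],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c):
    """Return the word whose vector is closest to vec(a) - vec(b) + vec(c).

    The three input words are excluded from the candidates, as is usual
    for this kind of analogy query.
    """
    target = [
        x - y + z
        for x, y, z in zip(EMBEDDINGS[a], EMBEDDINGS[b], EMBEDDINGS[c])
    ]
    candidates = [w for w in EMBEDDINGS if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, EMBEDDINGS[w]))


result = analogy("Hitler", "Germany", "Italy")
print(result)
```

With these hand-picked numbers the nearest remaining word is "Mussolini". Nothing here says a real model would behave this way; that depends entirely on what it was trained on.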
Bro googled the word vector and was waiting to use it.
No, they're referring to the internal workings of AI models, which are essentially a series of incredibly high-dimensional matrices with extra bits around them to make them work. Individual concepts are embedded as vectors in the space that these models work in. That's why linear algebra is brought up so frequently in discussions of AI.
While it's true that linear algebra and vectors are used in learning models, they're not using the term in a way that shows they know something about the subject (at least, the modern subject). Concepts aren't embedded as vectors. In older models (before the craze), concepts were manually embedded as numbers or a collection of numbers, which could be a vector (but could be something else as well), and the machine would learn by modifying weights. However, in current models (and by current, I mean for at least a couple of years), concepts are learnt by the machine (weights are still modified by the machine as well) and the machine makes its own connections between the features presented to it.
For example, you give it a dataset of 10x10 pixel images (with text descriptions) and it reads each one as 100 pixels, each split into 3 numbers (RGB), and then looks for connections between those numbers and the pixels they occur in. It's not identifying what a boob is, but it knows that when an image has 'boob' in the text description, there's a very high likelihood that there will be a circular collection of pixels with lots of red somewhere in the image that are also connected to other pixels that often also have lots of red. That's me breaking down what a human would think given the same task/information, but the reality is that the machine will come up with its own connections/concepts, which are often both far better than a human's (when the model works, at least) and far more inscrutable to humans.
From my perspective as an algebraist, you seem to be splitting hairs when you make a distinction between vectors and n-tuples of real numbers. Furthermore, they're referencing a specific 3blue1brown video. I'm not saying their conclusion is correct; they're dead wrong, but that doesn't mean their understanding is so shallow that they're simply repeating a word they heard to sound smart.
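To make the "100 pixels split into 3 numbers" description concrete, here is a rough sketch of how a 10x10 RGB image looks as raw numeric input. The random pixel values, the names `image` and `flatten`, and the scaling to the 0..1 range are all assumptions for illustration; real pipelines differ in layout and preprocessing.

```python
import random

WIDTH, HEIGHT, CHANNELS = 10, 10, 3  # 10x10 pixels, 3 numbers (R, G, B) each

# A stand-in image filled with random values (assumption: 8-bit colour, 0..255).
random.seed(0)
image = [
    [[random.randint(0, 255) for _ in range(CHANNELS)] for _ in range(WIDTH)]
    for _ in range(HEIGHT)
]


def flatten(img):
    """Turn rows -> pixels -> channels into one flat list scaled to 0..1."""
    return [value / 255.0 for row in img for pixel in row for value in pixel]


features = flatten(image)
print(len(features))  # 300 = 100 pixels x 3 channels
```

The model only ever sees a list of numbers like `features` alongside the text description; any "concept" it ends up with is a pattern it found across many such lists.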