this post was submitted on 03 Sep 2025

First "modern and powerful" open source LLM?

Key features

  • Fully open model: open weights + open data + full training details including all data and training recipes
  • Massively Multilingual: 1811 natively supported languages
  • Compliant: Apertus is trained while respecting opt-out consent of data owners (even retrospectively), and avoiding memorization of training data
lime@feddit.nu 14 points 4 days ago* (last edited 4 days ago)

that's the problem with deletion requests: the data isn't in there. it can't be, from a purely mathematical standpoint. statistically, with the amount of stuff that goes into training, any full work included in an llm is represented by less than one bit. but the model just... remakes sensitive information from scratch. it reconstructs infringing data based on patterns.

which of course highlights the big issue with data anonymization: it can't really be done.
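
a rough back-of-envelope sketch of the "less than one bit" estimate above. every number here is an illustrative assumption (parameter count, weight precision, token count, average work length), not a figure from the post. whether the result lands under one bit per work depends entirely on those assumptions. either way it comes out orders of magnitude smaller than the text of the work itself.

```python
def bits_per_unit(n_params: int, bits_per_param: int, n_units: int) -> float:
    """Raw storage available in the weights, divided evenly over n_units."""
    return n_params * bits_per_param / n_units


# all values below are hypothetical, chosen only to show the arithmetic
N_PARAMS = 8_000_000_000         # assumed 8B-parameter model
BITS_PER_PARAM = 16              # assumed 16-bit weights
N_TOKENS = 15_000_000_000_000    # assumed 15T training tokens
TOKENS_PER_WORK = 1_000          # assumed average length of one work

# weight capacity per training token, then scaled up to one whole work
per_token = bits_per_unit(N_PARAMS, BITS_PER_PARAM, N_TOKENS)
per_work = per_token * TOKENS_PER_WORK

print(f"{per_token:.4f} bits of weight capacity per training token")
print(f"{per_work:.1f} bits per {TOKENS_PER_WORK}-token work")
```

this spreads capacity evenly over the training set, which is a simplification: real models do memorize some sequences far more than others, especially text that is repeated many times in the data.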