this post was submitted on 21 Sep 2025
Fuck AI
you are viewing a single comment's thread
An LLM can probably be trained to distinguish what humans regard as "important" using an evolutionary training strategy.
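To make the claim concrete, here is a minimal sketch of what such an evolutionary strategy might look like. Everything in it is invented for illustration (the keyword-weight "scorer" stands in for a real model, and `judge` stands in for whoever rates the output); it is not how any production system works:

```python
import random

def make_scorer(weights):
    """A candidate 'importance' model: scores a sentence by keyword weights."""
    def score(sentence):
        return sum(weights.get(word, 0.0) for word in sentence.lower().split())
    return score

def mutate(weights, vocab):
    """Randomly perturb one keyword weight to create a child candidate."""
    child = dict(weights)
    word = random.choice(vocab)
    child[word] = child.get(word, 0.0) + random.uniform(-1.0, 1.0)
    return child

def evolve(sentences, judge, vocab, generations=50, population=8):
    """Keep the candidate whose top-ranked sentence the judge rates highest."""
    best = {}
    best_fitness = float("-inf")
    for _ in range(generations):
        for _ in range(population):
            candidate = mutate(best, vocab)
            scorer = make_scorer(candidate)
            top = max(sentences, key=scorer)
            fitness = judge(top)  # the bottleneck: a human (or proxy) rating
            if fitness > best_fitness:
                best, best_fitness = candidate, fitness
    return best
```

Note that `judge` is called once per candidate per generation, which is exactly where the human-in-the-loop cost discussed below this comment comes in.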
If that were the case, why hasn't it been done yet?
I see three problems there:
1. Training speed: you end up with LLMs trained at human speed instead of machine speed, because a human has to review and grade the product of each training round. Adversarial training doesn't work here, since you don't have an NN that can recognize what a domain expert thinks is properly summarized data, which is what you'd need to train the generator against.

2. Unknown cost: you don't really know how much more training is needed to push an LLM beyond its current level of "importance" encoding. Prospects aren't good, either: the improvement in output quality per unit of training data has fallen steeply over time, so the input-to-quality relationship isn't linear, and we're already on the steep part of the curve, needing tons more input data to yield small improvements.

3. Domain specificity: you would need to train an LLM for each expert domain you want to support, because an expert-level sense of what is important in one domain does not carry over to other domains. And even in the domain that seems to attract the most investment in domain-specific LLMs, software development, their capabilities are stuck at the level of a quite junior Junior Developer.
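The diminishing-returns point can be illustrated with a toy power-law curve of the kind reported in the LLM scaling literature. The constants here are made up for illustration, not measured from any real model:

```python
def loss(tokens, irreducible=1.8, scale=400.0, exponent=0.3):
    """Toy scaling model: loss falls as a power law in training tokens."""
    return irreducible + scale / tokens ** exponent

# Each doubling of the data buys a smaller improvement than the last:
for tokens in (1e9, 2e9, 4e9, 8e9):
    print(f"{tokens:.0e} tokens -> loss {loss(tokens):.3f}")
```

Under any curve of this shape, the data needed for a fixed quality gain grows rapidly as quality rises, which is the "steep part of the curve" problem described above.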
It's my understanding that this is one of the ways DeepSeek really shines - instead of having a general one-size-fits-all model and trying to turn LLMs into general-purpose AI, they use a multitude of smaller models that can be hot-swapped for different tasks in different contexts. The kind of summary you want for a news article is vastly different from the kind of summary you want for an academic paper, and being able to recognize when to use different models for different use cases is very powerful.
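The routing idea can be sketched at the application level like this. The classifier heuristic and the per-domain summarizers are invented placeholders; a real mixture-of-experts model like DeepSeek's routes between experts inside the network, per token, rather than between whole models:

```python
def classify(text):
    """Crude document-type detector (illustrative heuristic only)."""
    lowered = text.lower()
    if "abstract" in lowered or "we propose" in lowered:
        return "academic_paper"
    if "reuters" in lowered or "reported" in lowered:
        return "news_article"
    return "general"

SUMMARIZERS = {
    # Each entry would be a separately fine-tuned model in a real system.
    "academic_paper": lambda text: "Summary of methods and findings: ...",
    "news_article":   lambda text: "Who/what/when/where digest: ...",
    "general":        lambda text: "Generic summary: ...",
}

def summarize(text):
    """Hot-swap the summarizer based on the detected document type."""
    return SUMMARIZERS[classify(text)](text)
```

Even this crude dispatch captures the point: the news summarizer and the paper summarizer can optimize for entirely different notions of "important."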