this post was submitted on 14 Aug 2025
776 points (98.6% liked)

Technology

[–] kalleboo@lemmy.world 15 points 10 hours ago

They literally don't know. "GPT-5" is several models, with a gating model in front that chooses which one to use depending on how "hard" it thinks the question is. They've already been tweaking the front end to change how it cuts over. They're definitely going to keep changing it.
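
To make the "gate in front of several models" idea concrete, here is a toy sketch. Everything in it is hypothetical: the difficulty heuristic, the keyword list, the threshold and the backend names `small-model` and `large-model` are made up for illustration and say nothing about how any real router is built.

```python
def estimate_difficulty(prompt: str) -> float:
    """Made-up heuristic: longer prompts and 'reasoning' keywords score higher (0..1)."""
    keywords = ("prove", "debug", "step by step", "why")
    # Length contributes at most 0.5 to the score.
    score = min(len(prompt) / 500, 1.0) * 0.5
    # Any keyword match contributes another 0.5.
    if any(k in prompt.lower() for k in keywords):
        score += 0.5
    return score


def route(prompt: str, threshold: float = 0.5) -> str:
    """Pick a hypothetical backend; the threshold is the 'cut-over' knob."""
    if estimate_difficulty(prompt) >= threshold:
        return "large-model"
    return "small-model"


print(route("hi"))
print(route("Please debug this"))
```

Moving `threshold` changes which prompts reach the expensive backend, which is the kind of tweak that shifts the average energy cost per query without touching the models themselves.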

[–] MrSmith@lemmy.world 4 points 8 hours ago

Pump it Sammy, pump it harder!!

[–] Transtronaut@lemmy.blahaj.zone 19 points 12 hours ago

If anyone has ever wondered what it would look like if tech giants went all in on "brute force" programming, this is it. This is what it looks like.

[–] Saledovil@sh.itjust.works 58 points 21 hours ago (1 children)

It's safe to assume that any metric they don't disclose is quite damning to them. Plus, these guys don't really care about the environmental impact, or what us tree-hugging environmentalists think. I'm assuming the only group they are scared of upsetting right now is investors. The thing is, even if you don't care about the environment, the problem with LLMs is how poorly they scale.

An important concept when evaluating how something scales is marginal values, chiefly marginal utility and marginal expenses. Marginal utility is how much utility you get from one more unit of whatever. Marginal expenses is how much it costs to get one more unit. And what the LLMs produce is the probability that a token, T, follows prefix Q. So P(T|Q) (read: probability of T, given Q). This is done for all known tokens, and then based on these probabilities, one token is chosen at random. This token is then appended to the prefix, and the process repeats, until the LLM produces a sequence which indicates that it's done talking.
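
The sampling loop described above can be sketched in a few lines. This is only a toy: the lookup table stands in for the model, and the vocabulary, the probabilities and the `<end>` marker are invented for illustration.

```python
import random

END = "<end>"  # hypothetical "done talking" marker

# Toy stand-in for an LLM: P(T|Q) depends only on the last token of Q.
TABLE = {
    None: {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a": {"cat": 0.5, "dog": 0.5},
    "cat": {"sleeps": 0.7, END: 0.3},
    "dog": {"sleeps": 0.7, END: 0.3},
    "sleeps": {END: 1.0},
}


def next_token_probs(prefix):
    """Return {token: P(token | prefix)} for the toy model."""
    last = prefix[-1] if prefix else None
    return TABLE[last]


def generate(rng, max_tokens=20):
    """Sample one token at a time and append it until END is drawn."""
    prefix = []
    for _ in range(max_tokens):
        probs = next_token_probs(prefix)
        tokens = list(probs)
        weights = [probs[t] for t in tokens]
        token = rng.choices(tokens, weights=weights, k=1)[0]
        if token == END:
            break
        prefix.append(token)
    return prefix


print(generate(random.Random(0)))
```

Each pass through the loop is one dice roll, and whatever comes out becomes part of the prefix for every later roll.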

If we now imagine the best possible LLM, then the calculated value for P(T|Q) would be the actual value. However, it's worth noting that this already displays a limitation of LLMs. Namely, even if we use this ideal LLM, we're just a few bad dice rolls away from saying something dumb, which then pollutes the context. And the larger we make the LLM, the closer its results get to the actual value. A potential way to measure this precision would be to subtract P(T|Q) from P_calc(T|Q) and count the leading zeroes of the difference, essentially counting the number of digits we got right. Now, the thing is that each additional digit only provides a tenth of the utility of the digit before it, while the cost for additional digits goes up exponentially.
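
A minimal sketch of that "count the leading zeroes" measure, assuming both probabilities are given as decimal strings. The function name `correct_digits` and the example numbers are made up; `Decimal` is used so the digit count is not thrown off by binary floating point.

```python
from decimal import Decimal


def correct_digits(p_true: str, p_calc: str):
    """Count the zeroes right after the decimal point in |P_calc - P|.

    Returns None when the two values match exactly.
    """
    error = abs(Decimal(p_calc) - Decimal(p_true))
    if error == 0:
        return None
    # adjusted() is the exponent of the leading digit: 0.0015 -> -3,
    # which means two zeroes come before the first nonzero digit.
    return max(0, -error.adjusted() - 1)


print(correct_digits("0.5", "0.501"))    # error 0.001
print(correct_digits("0.5", "0.50012"))  # error 0.00012
```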

So, exponentially decaying marginal utility meets exponentially growing marginal expenses. Which is really bad for companies that try to market LLMs.
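
To put rough numbers on that, here is a sketch under stated assumptions: the first digit has utility 1 and cost 1, each further digit is worth a tenth of the previous one, and cost grows by a factor `growth` per digit. The default `growth=10` is purely an illustrative guess, the argument above only says "exponentially".

```python
from fractions import Fraction


def marginal_utility(n: int) -> Fraction:
    """Utility of the n-th correct digit: a tenth of the digit before it."""
    return Fraction(1, 10) ** (n - 1)


def marginal_cost(n: int, growth: int = 10) -> Fraction:
    """Cost of the n-th correct digit, growing by `growth` per digit (assumed)."""
    return Fraction(growth) ** (n - 1)


def utility_per_cost(n: int, growth: int = 10) -> Fraction:
    """How much utility one unit of spending buys at digit n."""
    return marginal_utility(n) / marginal_cost(n, growth)


for n in range(1, 5):
    print(n, float(utility_per_cost(n)))
```

With these assumptions the return on spending shrinks by a factor of `10 * growth` per extra digit, so a 100x drop per digit at the default setting.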

[–] Jeremyward@lemmy.world 12 points 20 hours ago (5 children)

Well I mean also that they kinda suck, I feel like I spend more time debugging AI code than I get working code.

[–] SkunkWorkz@lemmy.world 10 points 18 hours ago (1 children)

I only use it if I'm stuck. Even if the AI code is wrong, it often pushes me in the right direction to find the correct solution for my problem. Like pair programming but a bit shitty.

The best way to use these LLMs with coding is to never use the generated code directly and atomize your problem into smaller questions you ask to the LLM.

[–] fibojoly@sh.itjust.works 7 points 14 hours ago (1 children)

So duck programming right?

[–] Knock_Knock_Lemmy_In@lemmy.world 4 points 13 hours ago

And fancier intellisense

[–] Sl00k@programming.dev 0 points 9 hours ago (1 children)

Do you use Claude Code? It's the only time I've had 90%+ success rate.

[–] Jeremyward@lemmy.world 2 points 8 hours ago

I have, and it doesn't, at least not on the dev-ops stuff I work on.

[–] devfuuu@lemmy.world 3 points 12 hours ago

How can anyone look at that face and trust anything that madman could have to say?

[–] Tollana1234567@lemmy.today 24 points 21 hours ago (7 children)

intense electricity demands, and WATER for cooling.

[–] fuzzywombat@lemmy.world 45 points 1 day ago (3 children)

Sam Altman has gone into PR and hype overdrive lately. He is practically everywhere trying to distract the media from seeing the truth about LLMs. GPT-5 has basically proved that we've hit a wall and that the belief that LLMs will just scale linearly with the amount of training data is false. He knows the AI bubble is bursting and he is scared.

[–] rozodru@lemmy.world 8 points 15 hours ago (2 children)

Bingo. If you routinely use LLMs/AI you've recently seen it first hand. ALL of them have become noticeably worse over the past few months. Even if simply using it as a basic tool, it's worse. Claude, for all the praise it receives, has also gotten worse. I've noticed it starting to forget context or constantly contradicting itself. Even Claude Code.

The release of GPT5 is proof in the pudding that a wall has been hit and the bubble is bursting. There's nothing left to train on and all the LLMs have been consuming each other's waste as a result. I've talked about it on here several times already due to my work, but companies are also seeing this. They're scrambling to undo the fuck up of using AI to build their stuff. None of what they used it to build scales. None of it. And you go on LinkedIn and see all the techbros desperately trying to hype the mounds of shit that remain.

I don't know what's next for AI but this current generation of it is dying. It didn't work.

[–] BluesF@lemmy.world 4 points 13 hours ago

I was initially impressed by the 'reasoning' features of LLMs, but most recently ChatGPT gave me a response to a question in which it stated five or six possible answers separated by "oh, but that can't be right, so it must be...", and none of them was right lmao. Thought for like 30 seconds to give me a selection of wrong answers!

[–] Tja@programming.dev 2 points 13 hours ago

Any studies about this "getting worse" or just anecdotes? I do routinely use them and I feel they are getting better (my workplace uses Google suite so I have access to Gemini). Just last week it helped me debug an IPv6 RA problem that I couldn't crack, and I learned a few useful commands on the way.

[–] Saledovil@sh.itjust.works 11 points 21 hours ago

He's also already admitted that they're out of training data. If you've wondered why a lot more websites will run some sort of verification when you connect, it's because there's a desperate scramble to get more training data.

[–] TheObviousSolution@lemmy.ca 11 points 20 hours ago

When you want to create the shiniest honeypot, you need high power consumption.

[–] redsunrise@programming.dev 292 points 1 day ago (15 children)

Obviously it's higher. If it was any lower, they would've made a huge announcement out of it to prove they're better than the competition.
