overview for diz

Gemini 2.5 "reasoning", no real improvement on river crossings. in c/techtakes@awful.systems

[–] diz@awful.systems 7 points 2 months ago* (last edited 2 months ago)

Not really. Here's the chain-of-word-vomit that led to the answers:

Note that in the "it's impossible" answer it correctly echoes that you can take one other item with you, and does not bring the duck back (while the old overfitted GPT 4 obsessively brought items back). In the duck + 3 vegetables variant, it has a correct answer in the wordvomit, but not being an AI enthusiast it can't actually choose the correct answer (a problem shared with the monkeys on typewriters).

I'd say it clearly isn't ignoring the prompt or differences from the original river crossings. It just can't actually reason, and the problem requires a modicum of reasoning, much as unloading groceries from a car does.

Gemini 2.5 "reasoning", no real improvement on river crossings. in c/techtakes@awful.systems

[–] diz@awful.systems 6 points 2 months ago* (last edited 2 months ago) (2 children)

It’s a failure mode that comes from pattern matching without actual reasoning.

Exactly. Also, looking at its chain-of-wordvomit (which apparently I can't share other than by cutting and pasting it somewhere), I don't think this is the same as GPT 4 overfitting to the original river crossing and always bringing items back needlessly.

Note also that in one example it discusses moving the duck and another item across the river (so "up to two other items" works); it is not ignoring the prompt, and it isn't even trying to bring anything back. And its answer (calling it impossible) has nothing to do with the original.

In the other one it does bring items back, it tries different orders, even finds an order that actually works (with two unnecessary moves), but because it isn't an AI fanboy reading tea leaves, it still gives out the wrong answer.

Here are the full logs:

https://pastebin.com/HQUExXkX

Content warning: AI wordvomit so bad that it is folded away and hidden in a Google tool.

Gemini 2.5 "reasoning", no real improvement on river crossings. in c/techtakes@awful.systems

[–] diz@awful.systems 9 points 2 months ago* (last edited 2 months ago) (23 children)

Yeah, exactly. There's no trick to it at all, unlike the original puzzle.

I also tested OpenAI's offerings a few months back with similarly nonsensical results: https://awful.systems/post/1769506

The all-vegetables, no-duck variant is solved correctly now, but I doubt it is due to improved reasoning as such; I think they may have augmented the training data with some variants of the river crossing. The river crossing is one of the best-known puzzles, and various people have been posting hilarious bot failures with variants of it. So it wouldn't be unexpected that their training data augmentation has river crossing variants.

Of course, there are very many ways in which the puzzle can be modified, and their augmentation would only cover obvious stuff like variations on which items can be left with which, or the number of spots on the boat.
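
For what it's worth, those "obvious" variations are exactly the ones a few lines of brute-force search handle without any reasoning at all. Below is a minimal sketch, not anything the vendors are known to use: a breadth-first search where the function name `solve`, the parameters `conflicts` and `capacity`, and the example item names are all my own assumptions for illustration.

```python
from collections import deque
from itertools import combinations


def solve(items, conflicts, capacity):
    """Breadth-first search for a shortest sequence of crossings.

    items:     names of the things to ferry across (hypothetical names).
    conflicts: pairs that may not be left together without the farmer.
    capacity:  how many items fit in the boat besides the farmer.
    Returns a list of tuples (the cargo of each crossing), or None
    if no sequence of crossings works.
    """
    items = frozenset(items)
    conflicts = [frozenset(pair) for pair in conflicts]

    def safe(bank):
        # A bank is safe if no conflicting pair is wholly on it.
        return not any(pair <= bank for pair in conflicts)

    # State: (items still on the starting bank, is the farmer there too?)
    start = (items, True)
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        left, farmer_left = state
        if not left and not farmer_left:
            # Everything is across; walk back to recover the moves.
            moves = []
            while parent[state] is not None:
                state, cargo = parent[state]
                moves.append(cargo)
            return moves[::-1]
        here = left if farmer_left else items - left
        for k in range(capacity + 1):
            for cargo in combinations(sorted(here), k):
                if farmer_left:
                    new_left = left - set(cargo)
                    unattended = new_left
                else:
                    new_left = left | set(cargo)
                    unattended = items - new_left
                nxt = (frozenset(new_left), not farmer_left)
                if safe(unattended) and nxt not in parent:
                    parent[nxt] = (state, cargo)
                    queue.append(nxt)
    return None
```

Calling `solve(["wolf", "goat", "cabbage"], [("wolf", "goat"), ("goat", "cabbage")], 1)` covers the original puzzle, while an empty `conflicts` list gives the no-trick variants where you just haul everything across. Changing `capacity` or `conflicts` covers the boat-spots and who-eats-whom variations; anything more creative needs a different state model.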

OpenAI whistleblower found dead at 26 in San Francisco apartment | TechCrunch in c/techtakes@awful.systems

[–] diz@awful.systems 3 points 6 months ago* (last edited 6 months ago)

Full-time AI grift jobs would of course be forever closed to any AI whistleblower. There are still plenty of other jobs.

I did participate in the hiring process, and I can tell you that at your typical huge corporation the recruiter / HR are too inept to notice that you are a whistleblower, and don't give a shit anyway. And of the rank and file who will actually google you, plenty of people dislike AI.

At the rank and file level, the only folks who actually give a shit who you are are people who will have to work with you. Not the background check provider, not the recruiter.

MIT review selling a horrifying dystopia where an AI will monitor your rectum 24/7 and you repair your own fridge using AR glasses and haptics or something in c/techtakes@awful.systems

[–] diz@awful.systems 6 points 8 months ago

Well the OP talks about a fridge.

I think if anything it's even worse for tiny things with tiny screws.

What kind of floating hologram is there gonna be that's of any use, for something that has no schematic and the closest you have to a repair manual is some guy filming themselves taking apart some related product once?

It looks cool in a movie because it's a 20 second clip in which one connector gets plugged, and tens of person hours were spent on it by very talented people who know how to set up a scene that looks good and not just visually noisy.

MIT review selling a horrifying dystopia where an AI will monitor your rectum 24/7 and you repair your own fridge using AR glasses and haptics or something in c/techtakes@awful.systems

[–] diz@awful.systems 4 points 8 months ago

but often the video isn’t clear or fine quality enough

Wouldn't it be great if 100x the effort that didn't go into making the video clear or fine quality enough, instead didn't go into making relevant flying, see-through overlay decals?

Ultimately the reason it looks cool is that you're comparing a situation of little effort being put into repair related documentation, to some movie scenario where 20 person-hours were spent making a 20-second repair fragment whereby 1 step of a repair is done.

MIT review selling a horrifying dystopia where an AI will monitor your rectum 24/7 and you repair your own fridge using AR glasses and haptics or something in c/techtakes@awful.systems

[–] diz@awful.systems 4 points 8 months ago

I'm not sure it's actually being used, beyond C suite wanting something cool to happen and pretending it did happen.

MIT review selling a horrifying dystopia where an AI will monitor your rectum 24/7 and you repair your own fridge using AR glasses and haptics or something in c/techtakes@awful.systems

[–] diz@awful.systems 6 points 8 months ago (1 children)

Exactly. It goes something like "remember when you were fixing a washing machine and you didn't know what some part was and there was no good guide for fixing it, no schematic, no nothing? Wouldn't it be awesome if 100x of the work that wasn't put into making documentation was not put into making VR overlays?"

The Nobel Prize in physics goes to Geoffrey Hinton for his work in computer science. What? in c/techtakes@awful.systems

[–] diz@awful.systems 3 points 8 months ago* (last edited 8 months ago)

Using tools from physics to create something that is popular but unrelated to physics is enough for the nobel prize in physics?

If only, it's not even that! Neither Boltzmann machines nor Hopfield networks led to anything used in the modern spam and deepfake generating AI, nor in image recognition AI, or the like. This is the kind of stuff that struggles to get above 60% accuracy on MNIST (handwritten digits).

Hinton went on to do some different stuff based on backpropagation and gradient descent, on newer computers than those available to the people who came up with it long before him, and so he got the Turing Award for that. It's a wee bit controversial because of the whole "people doing it before, but on worse computers, and so they didn't get any award" thing, but at least it is for work that is on the path leading to modern AI, and not for work that is part of the vast list of things that just didn't work, where it's extremely hard to explain why you would even think they would work in the first place.

The Nobel Prize in physics goes to Geoffrey Hinton for his work in computer science. What? in c/techtakes@awful.systems

[–] diz@awful.systems 3 points 8 months ago

Then next year Hopfield and Hinton go back to Sweden, don't tell king of Sweden anything, king of Sweden still gives them the Nobel Prize! King of Sweden now has conditioned reflex!

MIT review selling a horrifying dystopia where an AI will monitor your rectum 24/7 and you repair your own fridge using AR glasses and haptics or something in c/techtakes@awful.systems

[–] diz@awful.systems 13 points 8 months ago (22 children)

I seriously wonder, do any of the folks with the "AR glasses to assist repair" thing ever actually repair anything, or do they get their ideas of how you repair stuff from computer games?

The Nobel Prize in physics goes to Geoffrey Hinton for his work in computer science. What? in c/techtakes@awful.systems

[–] diz@awful.systems 8 points 8 months ago* (last edited 8 months ago)

Nobel Prize in Physics for attempting to use physics in AI, which didn't really work very well. Then one of the guys went on to work on a better, more purely mathematical approach that actually worked, and got the Turing Award for the latter, but that's not what the prize is for, while the other guy did some other work, but that is not what the prize is for either. AI will solve all physics!!!111