An interesting thing came through the arXiv-o-tube this evening: "The Illusion-Illusion: Vision Language Models See Illusions Where There are None".
It's definitely linked in with the problem we have with LLMs where they detect the context surrounding a common puzzle rather than actually doing any logical analysis. In the image case I'd be very curious to see the control experiment where you ask "which of these two lines is bigger?" and then feed it a photograph of a dog rather than two lines of any length. I'm reminded of how it was (is?) easy to trick ChatGPT into nonsensical solutions to any situation involving crossing a river, because it pattern-matched to the chicken/fox/grain puzzle rather than considering the actual facts being presented.
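If anyone wants to actually run that control, here's a rough sketch, assuming the OpenAI Python SDK and a vision-capable model (the model name and the dog photo's path are placeholders, not anything from the paper):

```python
# Rough sketch of the control experiment: ask the leading "two lines"
# question while feeding the model an image that contains no lines at all.
# Assumes the OpenAI Python SDK; model name and image path are placeholders.
import base64

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask_about_lines(image_path: str) -> str:
    """Send the leading question plus an arbitrary image to the model."""
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    response = client.chat.completions.create(
        model="gpt-4o",  # any vision-capable model would do here
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Which of these two lines is bigger?"},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content


# Control condition: a photo of a dog, no lines anywhere.
print(ask_about_lines("dog.jpg"))
```

If the model names a "longer line" anyway, that's the same presupposition-swallowing failure as the river-crossing case.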
Also, now that I type it out, I think there's a framing issue with that entire illusion, since the question presumes that one of the two is bigger. But that's neither here nor there.
I disagree, or rather I think that's actually a feature; "neither" is a perfectly reasonable answer that a human being would give to that question, and one that LLMs would be fucked by, since they basically never go against the prompt.