this post was submitted on 25 Aug 2025

Top AI models fail spectacularly when faced with slightly altered medical questions (jamanetwork.com)

submitted 1 day ago by Pro@programming.dev to c/fuck_ai@lemmy.world

cross-posted from: https://programming.dev/post/36289727

Comments

Reddit.

Our findings reveal a robustness gap for LLMs in medical reasoning, demonstrating that evaluating these systems requires looking beyond standard accuracy metrics to assess their true reasoning capabilities.[6] When forced to reason beyond familiar answer patterns, all models demonstrate declines in accuracy, challenging claims of artificial intelligence’s readiness for autonomous clinical deployment.

A system dropping from 80% to 42% accuracy when confronted with a pattern disruption would be unreliable in clinical settings, where novel presentations are common. The results suggest that these systems are more brittle than their benchmark scores suggest.

[–] Peanutbjelly@sopuli.xyz -3 points 17 hours ago* (last edited 17 hours ago) (1 children)

cats also suck at analogies and metaphors, but they still have intelligence.

a rock could not accurately interpret and carry out complex adjustments to a document. LLMs can.

if the rock was... travelling through complex information channels and high-dimensional concept spaces to interpret the text i gave it, and accurately performed the requested task being represented within those words, yeah it might be a little intelligent.

but i don't know any stones that can do that.

or are you referring to the 'stochastic parrot' argument which tries to demonize confabulatory properties of the model, as if humans don't have and use confabulatory processes?

just because we have different tools we use alongside those confabulatory processes does not mean we are literally the opposite.

or just find some people to be loud with you so you can ignore the context or presented dissonance. this is really popular with certain groups of 'intelligent' humans, which i often lovingly refer to as "cults," which never have to spend energy thinking about the world, cause they can just confabulate their own shared idea of what the world is, and ignore anyone trying to bring that annoying dissonance into view.

also humans are really not that amazingly 'intelligent' depending on the context. especially those grown in an environment that does not express a challenging diversity of views from which to collectively reduce shared dissonance.

if people understood this, maybe we could deal with things like the double empathy problem. but the same social-confirmation modes ensure minority views don't get heard, and the dissonance is just a signal that we collectively get mad at until it's quiet again.

isn't that so intelligent of humanity?

but no, let's all react with aggression to all dissonance that appears, like a body that intelligently recognizes the threat of peanuts, and kills itself. (fun fact, cellular systems are great when viewed through this lens. see tufts university and michael levin for some of the coolest empirical results i've ever seen in biology.)

we need to work together and learn from our shared different perspectives, without giving up to a paperclip maximizing social confirmation bubble, confabulating a detached delusion into social 'reality.'

to do this, understanding the complex points i'm trying to talk about is very important.

compressing meaning into language is hard when the interpreting models want to confabulate their own version that makes sense, but excludes any of your actual points, and disables further cooperative communication.

i can make great examples, but it doesn't go far if people don't have any knowledge of

-current sociology

-current neuro-psych

-current machine learning

-current biology

-cults and confirmation bubbles, and how they co-confirm their own reality within confabulated complexity.

-why am i trying so hard, nobody is actually reading this, they are just going to skim it and downvote me because my response wasn't "LLMS BAD, LLMS DUMB!"

-i'm tired.

-i appreciate all of you regardless, i just want people to deal with more uncomfortable dissonance around the subject before having such strong opinions.

[+] zeropointone@lemmy.world 0 points 16 hours ago* (last edited 1 minute ago) (1 children)

[deleted]

[–] Peanutbjelly@sopuli.xyz 2 points 15 hours ago

it sure as hell shouldn't be making any important choices unilaterally.

and people actively using it for things like... face recognition, knowing it has bias issues that lead to false flags for people with certain skin tones, should probably be behind bars.

although that stuff often feels more intentional, like the failure is an 'excuse' to keep using it. see 'mind-reading' tactics that have the same bias issues but still get officially sanctioned for use. (there's a good rabbit hole there)

it's also important to note that supporters of AI generally have had to deal with moving goalposts.

like... if linux fixed every problem being complained about, and then something else being missing became the new reason linux is terrible, as if the original issue was just an excuse to hate on linux.

both fanboys and haters are a problem, and those who want to address reality and continue to improve linux while recognizing and addressing its problems have to deal with both of those tribes attacking them for either not believing in the linux god or not believing in the linux devil.

weirdly, actually understanding intelligent systems is a good way to deal with that issue, but unless people are willing to accept new information that isn't just blind tribal affirmation, they will continue to act like a paperclip maximizer for whatever momentum is socially salient. tribal war and such.

i just want to... not ignore any part of the reality. that means both the really cool new tools (see genie 3, which resembles what haters have been saying is literally impossible for a long time) and the environment we live in. (google is pretty evil, rich people are taking over, and modern sciences have a much better framing of the larger picture that is important for us to socially spread.)

really appreciate your take!