Top AI models fail spectacularly when faced with slightly altered medical questions
(jamanetwork.com)
"We did it, Patrick! We made a technological breakthrough!"
I agree, but being able to reconstruct content like that also suggests some intelligence.
That said, when you have no way of telling what is in-sample versus out-of-sample, or what is correct versus merely convincing gibberish, you should never rely on these tools.
The only time I really find them useful is with tool use and RAG, where they can filter piles of content and route me to the useful parts faster.
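For what it's worth, that "filter, then route" step is simple enough to sketch. Here's a minimal, hypothetical version using TF-IDF similarity as a stand-in for the embedding search a real RAG stack would use; all names are illustrative, not from any particular library:

```python
# Minimal sketch of the "filter piles of content, then route to the
# useful parts" idea. TF-IDF cosine similarity stands in for a learned
# embedding model; function and variable names are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def route_to_passages(query: str, passages: list[str], top_k: int = 3) -> list[str]:
    """Return the top_k passages most similar to the query."""
    vectorizer = TfidfVectorizer()
    matrix = vectorizer.fit_transform(passages + [query])
    # The last row is the query; score it against every passage.
    scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    ranked = scores.argsort()[::-1][:top_k]
    return [passages[i] for i in ranked]

passages = [
    "The study altered answer options in medical benchmark questions.",
    "Model accuracy dropped sharply on the modified questions.",
    "Unrelated passage about conference schedules.",
]
print(route_to_passages("why did accuracy drop?", passages, top_k=2))
```

The point is that the LLM's job shrinks to summarizing whatever this retrieval step surfaces, which is much easier to spot-check than open-ended generation.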
Can a colander suggest novel math proofs?
Because LLMs can, and have.
That’s not an accurate characterization.
There are LLMs trained on brute-forced sets of lemmas that can then predict new ones, and there are “regular” models, evaluated on math benchmarks, that can produce new theorems from prompting plus their latent parameters.
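To make the lemma-prediction claim concrete: these models emit formal statements in a proof assistant's language, and a mechanical checker then accepts or rejects each one. A toy Lean example of the kind of statement-plus-proof pair involved (chosen purely for illustration, not taken from any actual training set):

```lean
-- A toy statement-and-proof pair of the sort a lemma-trained model is
-- asked to produce; the proof checker verifies the output mechanically.
-- This particular lemma is illustrative, not from any real training set.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

The checker is what makes the setup work: a model can guess freely, and only verified lemmas make it back into the pool.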