this post was submitted on 12 Jun 2025

ChatGPT will avoid being shut down in some life-threatening scenarios, former OpenAI researcher claims (techcrunch.com)

submitted 1 month ago by MCasq_qsaCJ_234@lemmy.zip to c/technology@lemmy.world

[–] riot@fedia.io 72 points 1 month ago (2 children)

I hate articles like this so much. ChatGPT is not sentient, it doesn't feel, it doesn't have thoughts. It has regurgitation and hallucinations.

They even had another stupid article linked about "AI blackmailing developers, when they try to turn it off." No, an LLM participates in a roleplay session that testers come up with.

It's articles like this that make my family think that LLMs are reasoning and intelligent "beings". Fuck off.

[–] Capricorn_Geriatric@lemmy.world 12 points 1 month ago* (last edited 1 month ago) (1 children)

ChatGPT is not sentient, it doesn't feel, it doesn't have thoughts. It has regurgitation and hallucinations.

ChatGPT isn't sentient, doesn't feel or have thoughts. It has

While I agree with what you mean, I'd just like to point out that "hallucinations" is just another embellished word like the ones you critique - were AI to have real hallucinations, it would need to think and feel. Since it doesn't, its "hallucinations" are hallucinations only to us.

[–] squaresinger@lemmy.world 6 points 1 month ago (1 children)

Hallucinations mean something specific in the context of AI. It's a technical term, same as "putting an app into a sandbox" doesn't literally mean that you pour sand into your phone.

Human hallucinations and AI hallucinations are very different concepts caused by very different things.

[–] Feyd@programming.dev 3 points 1 month ago (1 children)

No, it's not. "Hallucinations" is marketing to make the fact that LLMs are unreliable sound cool. Simple as.

[–] squaresinger@lemmy.world 3 points 1 month ago (2 children)

Nope. Hallucinations are not a cool thing. They are a bug, not a feature. The term itself is also far from cool or positive. Or would you think it's cool if humans have hallucinations?

[–] Feyd@programming.dev 4 points 1 month ago (1 children)

In this very comment you are anthropomorphizing them by comparing them to humans again. This is exactly why they've chosen this specific terminology.

[–] squaresinger@lemmy.world 1 points 1 month ago* (last edited 1 month ago) (1 children)

It's not anthropomorphizing, it's how new terms are created.

Pretty much every new term ever draws on already existing terms.

A car is called a car because that term was first used for streetcars before that, and for passenger train cars before that, and before that it was used for cargo train cars, and before that it was used for a chariot, and originally it was used for a two-wheeled Celtic war chariot. Not a lot of modern cars have two wheels and a horse.

A plane is called a plane, because it's short for airplane, which derives from aeroplane, which means the wing of an airplane and that term first denoted the shell casings of a beetle's wings. And not a lot of modern planes are actually made of beetle wing shell casings.

You can do the same for almost all modern terms. Every term derives from a term that denotes something similar, often in another domain.

Same with AI hallucinations. Nobody with half an education would think that the cause, effect and expression of AI hallucinations are the same as for humans. OpenAI doesn't feed ChatGPT hallucinogenics. It's just a technical term that means something vaguely related to what the term originally meant for humans, same as "plane" and "beetle wing shell casing".

[–] Feyd@programming.dev 2 points 1 month ago

🙄

[–] kipo@lemm.ee 2 points 1 month ago (2 children)

'Hallucinations' are not a bug though; it's working exactly as intended and this is how it's designed. There's no bug in the code that you can go in and change that will 'fix' this.

LLMs are impressive auto-complete, but sometimes the auto-complete doesn't spit out factual information because LLMs don't know what factual information is.

[–] dragonfly4933@lemmy.dbzer0.com 1 points 1 month ago

I don't think calling hallucinations a bug is strictly wrong, and I wouldn't say the model is working as intended either. The intent is defined by the developers or the company, and they don't want hallucinations because that reduces the usefulness of the models.

I also don't think we know for a fact that this problem can't be solved with current technology; we simply have not found a useful solution yet.

[–] squaresinger@lemmy.world 1 points 1 month ago

They aren't a technical bug, but a UX bug. Or would you claim that an LLM that outputs 100% non-factual hallucinations and no factual information at all is just as desirable as one that doesn't do that?

Btw, LLMs don't have any traditional code at all.

[–] Hackworth@sh.itjust.works 10 points 1 month ago

That was in Anthropic's system card for Claude 4, and the headlines/articles largely missed the point. Regarding the blackmail scenario, the paper even says:

... these extreme actions were rare and difficult to elicit, while nonetheless being more common than in earlier models. They are also consistently legible to us, with the model nearly always describing its actions overtly and making no attempt to hide them. These behaviors do not appear to reflect a tendency that is present in ordinary contexts.

They're testing alignment hacking and jail-breaking tactics in general to see how the models respond. But the greater concern is that a model will understand as part of the context that it is being tested and behave differently in testing than in deployment. This has already been an issue.

In the initial implementations of reasoning models, if an LLM was penalized directly for this kind of misaligned generation in its "scratch pad," it would not alter its misaligned response - rather it would simply omit the misaligned generation from the scratch pad. In other words, the model's actions were no longer consistently legible.