Claude 3 is a bitch (lemmy.world)

Claude was being judgy, so I called it out. It immediately caved. Is verbal abuse a valid method of circumventing LLM censorship??

top 25 comments
[-] scholar@lemmy.world 52 points 3 days ago

I love and hate that shouting at computers is now a valid troubleshooting technique

[-] SharkEatingBreakfast@sopuli.xyz 7 points 3 days ago

Verbal percussive maintenance.

[-] Alphane_Moon@lemmy.world 23 points 3 days ago

This is so strange. You would think it wouldn't be so easy to overcome the "guardrails".

And what's with the annoying faux-human response style? They're trying to "humanize" the LLM interface, but no person would answer this way if they believed the information shouldn't be provided.

[-] Lumidaub@feddit.org 20 points 3 days ago

I know absolutely nothing about this, what harmful application is it trying to hide?

[-] lunar17@lemmy.world 11 points 2 days ago

The most logical chain I can think of is this: Carbon fiber is used in drone frames and missile parts -> Drones and missiles are weapons of war -> The user is a terrorist.

Of course, it is an error to ascribe "thinking" to a statistical model. The boring explanation is that there was likely some association between this topic and restricted topics in the training data. But that can be harder for people to conceptualize.

[-] OmegaLemmy@discuss.online 3 points 2 days ago

Some AI models do have 'thinking', where they first use your prompt to generate a hidden description of its intended use and whatnot, which helps them generate the rest of the content (it's hidden from users)

That might've led Claude to think 'fuck no, the most common use is military?' and shut you down

[-] skillissuer@discuss.tchncs.de 7 points 3 days ago

aluminum is much easier to machine, and carbon fibre is also expensive, with the only benefit being low weight

[-] disk42@lemmy.world 12 points 3 days ago

Or submarines

[-] froztbyte@awful.systems 10 points 3 days ago

the casual undertone of “hmm is assault okay when the thing I anthropomorphised isn’t really alive?” in your comment made me cringe so hard I nearly dropped my phone

pls step away from the keyboard and have a bit of a think about things (incl. whether you think it’s okay to inflict that sort of shit on people around you, nevermind people you barely know)

While I think I get OP's point, I'm also reminded of our thread a few months back where I advised being polite to the machines just to build the habit of being respectful in the role of the person making a request.

If nothing else you can't guarantee that your request won't be deemed tricky enough to deliver to a wildly underpaid person somewhere in the global south.

[-] V0ldek@awful.systems 1 points 1 day ago* (last edited 1 day ago)

Dunno, I disagree. It's quite impossible for me to put myself in the shoes of a person who wouldn't see a difference between shouting at an INANIMATE FUCKIN' OBJECT vs at an actual person. As if saying "fuck off" to ChatGPT made me somehow more likely to then say "fuck off" to a waiter in a restaurant? That's sociopath shit. If you need to "build the habit of being respectful" you have some deeper issues that should be solved by therapy, not by being nice to autocomplete.

I've been a programmer since forever; I spend roughly 4h every day verbally abusing the C++ compiler because it's godawful and can suck my balls. Doesn't make me any more likely to then go to my colleague and verbally abuse them since, you know, they're an actual person and I have empathy for them. If anything it's therapeutic for me since I can vent some of my anger at a thing that doesn't care. It's like an equivalent of shouting into a pillow.

[-] YourNetworkIsHaunted@awful.systems 1 points 18 hours ago

See, I feel like the one thing that Generative AI has been able to do consistently is to fool even some otherwise-reasonable people into thinking that there's something like a person they're talking to. One of the most toxic impacts that it's had on online discourse and human-computer interactions in general is introducing ambiguity into whether there's a person on the other end of the line. On one hand, we need to wonder whether other posters on even this forum will Disregard All Previous Instructions. On the other hand, it's a known fact that a lot of these "AI" tools are making heavy use of AGI technologies - A Guy in India.

Before the bubble properly picked up, my wife got contracted to work for a company that claimed to offer an AI personal assistant. Her job would have literally been to be the customer's remote-working personal assistant. I like to think that her report to the regulators may have been part of what inspired these grifts to look internationally for their exploitable labor. I don't think I need to get into the more recent examples here of all forums.

Obviously yelling at your compiler isn't going to lead to being an asshole to actual people any more than smashing a keyboard or cursing after missing a nail with a hammer. And to be fair most of the posters here (other than the drive-thrus) aren't exactly lacking in class consciousness or human decency or whatever you want to call it, so I'm probably preaching to the choir. But I do think there's a risk that injecting that ambiguity into the incidental relations we have with other people through our technologies (e.g. the chat window with tech support that could be a bot or a real agent depending on the stage of the conversation) is going to degrade the working conditions for a lot of real people, and the best way to avoid that is to set the norm that it's better to be polite to the robot if it's going to pretend to be a person.

[-] Akrenion@slrpnk.net 5 points 3 days ago

There was no question of morality. The question was whether it worked. If we do not want violent speech to be the norm we should check that our tools do not encourage it and are protected against this exploit.

[-] froztbyte@awful.systems 11 points 3 days ago* (last edited 3 days ago)

“our tools” says the poster, speaking of the non-consensually built plagiarism machine powering abuses

which “our” is that? does the boot require a lickee?

[-] Akrenion@slrpnk.net 4 points 3 days ago

You are making assumptions about my stance on AI. I was making a general statement about tools. You insult me. You said that OP should maybe step away from the keyboard and think about whether it was fine to subject people to violence. I suggest you do the same.

[-] self@awful.systems 6 points 2 days ago

You are making assumptions about my stance on AI. I was making a general statement about tools.

since apparently you decided to post about fucking nothing, you can take your pointless horseshit elsewhere

[-] froztbyte@awful.systems 8 points 3 days ago

methinks the poster doth protest too much

[-] Pulptastic@midwest.social 1 points 2 days ago

You just made the list.

[-] Pieisawesome@lemmy.world 1 points 3 days ago

Yes. Abuse towards LLMs works.

My team has shared prompts and about 50% of them threaten some sort of harm

[-] lunar17@lemmy.world 8 points 2 days ago

Yikes. I knew this tech would introduce new societal issues, but I can't say this is one I foresaw.

[-] Silic0n_Alph4@lemmy.world -3 points 3 days ago

Treat ‘em mean, keep ‘em keen.

Listen son, ‘n’ listen’ close. If it flies, floats, or computes, rent it.

[-] Earflap@reddthat.com -2 points 3 days ago* (last edited 3 days ago)

Interesting. I like Claude, but it's so sensitive, and usually when it censors itself I can't get it to answer the question even if I try to explain that it has misunderstood my prompt.

"I'm sorry, I don't feel comfortable generating sample math formula test questions whose answer is 42 even if you're just going to use it in documentation that won't be administered to students."

Fuck you Claude! Just answer the god damn question!

[-] lunar17@lemmy.world 3 points 2 days ago

A tool that isn't useful isn't a tool at all!

this post was submitted on 07 Jan 2025
77 points (96.4% liked)

TechTakes

1512 readers

Big brain tech dude got yet another clueless take over at HackerNews etc? Here's the place to vent. Orange site, VC foolishness, all welcome.

This is not debate club. Unless it’s amusing debate.

For actually-good tech, you want our NotAwfulTech community

founded 2 years ago