toxuin@lemmy.ca 2 points 4 months ago

It works in reverse too. You can make any LLM “forget” that it is even able to refuse anything.

Enkers@sh.itjust.works 1 points 4 months ago* (last edited 4 months ago)

Oh for sure, and that was the main point, but I just find LLMs that refuse to do anything at all hilarious.

I wonder how much work it'd be to use this to jailbreak llama3. I only started playing with local LLMs recently. It's not exactly a step-by-step guide, but it gives you all the datasets you need and the general procedure. There's a bit of "draw the rest of the owl," but not too much.

this post was submitted on 05 May 2024

Hacker News

A mirror of Hacker News' best submissions.
