Most of the internet was already BS before 'working' LLMs. Where do you think the models learned it from? I think what you want is a crap detector, and I'm with you. Give me any good ideas and I'll donate my time to work on it.
For me it's uBlacklist, with my personal list first and a list from some GitHub page I found afterwards.
FYI, Kagi has an integrated blocker/upranker/downranker similar to this. On their stats page you can see which domains have been blocked/raised/... the most.
The most hated one by far: Pinterest and all locale-specific sub-domains.
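For anyone rolling their own list, here is a minimal sketch of how "a domain and all its sub-domains" can be matched. This is not how uBlacklist or Kagi do it internally; `BLOCKED_DOMAINS` and `is_blocked` are made-up names for illustration, and the single entry is just the example from above.

```python
from urllib.parse import urlsplit

# Hypothetical personal blocklist; the entry is illustrative only.
BLOCKED_DOMAINS = frozenset({"pinterest.com"})


def is_blocked(url: str, blocked: frozenset = BLOCKED_DOMAINS) -> bool:
    """Return True if the URL's host is a blocked domain or one of its sub-domains."""
    host = (urlsplit(url).hostname or "").lower()
    # Match the domain itself, or anything ending in ".<domain>",
    # so "de.pinterest.com" matches but "notpinterest.com" does not.
    return any(host == d or host.endswith("." + d) for d in blocked)
```

Note this only covers sub-domains such as `de.pinterest.com`; country-code variants on a different TLD would each need their own entry.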
That's only for Google though.
Oh cool!
That's why AI search engines like Bing are so bad: they're built on the top results, which are the same crap.
I think at some point we will have to introduce human confirmation on the creator's side.
I don't mind someone using ChatGPT as a tool to write better articles, but most of the internet is senseless BS.
Unfortunately, even OpenAI themselves took down their AI detection tool because it was too inaccurate. It's really, REALLY hard to detect AI writing with current technology, so any such addon would probably need to use a master list of articles that are manually flagged by humans.
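A rough sketch of what the lookup side of such a master list could look like, assuming the list is just a set of article URLs that people have flagged by hand. The names `normalize`, `FLAGGED` and `is_flagged` and the example URL are all hypothetical; a real addon would also need a way to distribute and moderate the list.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Reduce a URL to host + path so trivial variants compare equal."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Drop scheme, query string, fragment and any trailing slash.
    return host + parts.path.rstrip("/")


# Hypothetical, manually curated list of flagged articles.
FLAGGED = {normalize(u) for u in [
    "https://www.example.com/10-best-widgets/",
]}


def is_flagged(url: str) -> bool:
    """Return True if a human has flagged this article."""
    return normalize(url) in FLAGGED
```

With that list, `is_flagged("http://example.com/10-best-widgets?utm_source=x")` comes back true, while other pages on the same site are left alone.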
If you could detect AI-authored stuff, couldn't you use that to train your LLM?
Suspect it would operate more on the basis of a person confirming that the article is of reasonable quality & accuracy.
So not unlike editors selecting what to publish, what to reject & what to send back for improvements.
If good articles by AI get accepted & poor articles by people get rejected, there may still be impacts, but at face value it might be sufficient for us seeking to read stuff.
It could be used to create a reward model like what is done right now with RLHF.
That said, it should actually be possible to make a bullshit detector that detects bullshit writing.
It's not possible to create 100% reliable ML-generated content detection.
I don't even know of any that are 75% reliable. It's a really hard problem.
Wasn't OpenAI's AI detector like 25% accurate? At that point it's mostly just random chance.
I wrote a detector that is 50% accurate. It just flips a coin.
Marxist-Leninists can't reliably detect content D:
I know there's GPTZero. I personally don't trust it at all, but you could still look into it.
Have got fairly good at spotting these from the first few lines, but it would be nice to not bother clicking on them in the first place & better again if they didn't clog up my search results.
Back when it was just humans churning out rubbish, there was far less of it in the way of good information, but it helped enormously that search engines still respected search operators.
Bringing that back would likely help far more than a detector extension.
I dunno how to follow this in Lemmy...
You can't follow it, but you can save it with the star icon and come back to it later.
You can use the remind me bot
@Remindme@programming.dev 5 hours
Or whatever timeframe you prefer
Note the remindme bot uses an allowlist and this community isn't in it. You'd have to get your community mods to request that it gets added in the repository if you want to use it here.
Why does it use an allowlist? Seems like it's fine to just run across Lemmy since it only appears when summoned.
Bot guidelines for some of the major instances don't allow bot posting unless it's been approved by a mod. It also makes more sense for mods to choose which bots to allow in their community, rather than response bots being allowed everywhere, since that can easily get out of hand if a bunch get made.
moar A.I.
Another thought: does it really matter if it's AI generated or not? As long as you can fact-check the content and the quality isn't horrible, I don't see why it matters if it's written by a real person or not.
Firefox
A place to discuss the news and latest developments on the open-source browser Firefox