this post was submitted on 21 Mar 2025

Linux


LLM scrapers are taking down FOSS projects' infrastructure, and it's getting worse.

[–] refalo@programming.dev 2 points 4 days ago (2 children)

> What you’re doing is filtering out bots that can’t be bothered to execute JavaScript. You don’t need to do a computationally heavy PoW task to do that.

Most bots and scrapers from what I've seen already are using (headless) full browsers, and hence are executing JavaScript, so I think anything that slows them down or increases their cost can reduce the traffic they bring.

> Canvas fingerprinting filters out bots better than PoW

Source? I strongly disagree. It's not hard to change your browser characteristics to get a new canvas fingerprint every time, and some browsers like Firefox even have built-in options for it.

[–] sudo@programming.dev 1 points 1 day ago

> Most bots and scrapers from what I’ve seen already are using (headless) full browsers

That's not going to be the majority of your bot traffic by a long shot, because headless browsers don't scale the way basic HTTP requests do.

This is from personal experience. With PoW you just need any puppeted browser, maybe less. With canvas fingerprinting you need a heavily customized scraping browser, either one you made yourself or one you're paying for. If that's the case, the cost of PoW is negligible. If you still want actual stats, I'd have to ask where you're getting any stats on PoW working.

[–] YetiSkotch@ieji.de 1 points 4 days ago (1 children)

@refalo @sudo If proof of work gets widely adopted, I foresee a future where bot-running data centers can out-compute humans to visit sites, while the old devices of users in poorer countries struggle for hours to compute the required task … Or is that fear misguided?

[–] sudo@programming.dev 1 points 1 day ago (1 children)

Admins will always turn down the bot management when it starts blocking end users. At that point you cough up the money for the extra bandwidth and investigate different solutions.

[–] YetiSkotch@ieji.de 1 points 1 day ago* (last edited 1 day ago)

@sudo yeah, the bot problem is hard, especially for volunteers who help others.

https://nadeko.net/announcements/invidious-and-the-bot-problem/

* They use a proof-of-work system called #Anubis to fix their #bot problem. I hope it works. #proofofwork

The proof of work right now needs about 1 second on my phone, so I am happy with that.
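
For anyone who hasn't looked at how this kind of challenge works, here is a minimal hashcash-style sketch in Python. It is not taken from Anubis; the challenge string, the `solve`/`verify` names, the `challenge:nonce` format, and the difficulty value are all assumptions for illustration. The point it shows is the asymmetry: the visitor has to try many hashes, while the server checks the answer with one.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # For a non-zero byte, bit_length() gives the position of its top set bit.
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(challenge: str, difficulty_bits: int) -> int:
    """Client side: brute-force a nonce (roughly 2**difficulty_bits hashes on average)."""
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return nonce


def verify(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Server side: one hash is enough to check the submitted nonce."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty_bits


if __name__ == "__main__":
    # Hypothetical challenge and difficulty, chosen only for the demo.
    nonce = solve("example-challenge", 16)
    print(nonce, verify("example-challenge", nonce, 16))
```

Each extra bit of difficulty roughly doubles the expected solve time, which is why the same setting can feel instant on a fast machine and slow on an old phone.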

Perhaps the biggest problem with bots is the number of requests they make, which no normal human clicking on buttons could replicate.
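
To make the request-volume point concrete, here is a rough sliding-window counter in Python. It is only a sketch: the class name, the `client_id` key (an IP address, say), and the limits in the demo are hypothetical, not something any of the projects mentioned here are known to use.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within the last window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Maps a client id to the timestamps of its recently accepted requests.
        self._hits = defaultdict(deque)

    def allow(self, client_id: str, now: float) -> bool:
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # rejected requests are not recorded
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(max_requests=3, window_seconds=1.0)
    # Four requests within 0.3 s from one client: the fourth is refused.
    print([limiter.allow("203.0.113.7", t) for t in (0.0, 0.1, 0.2, 0.3)])
```

A counter like this only sees one client id at a time, so a scraper that spreads its requests over many addresses stays under the per-client limit.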