Open Source devs say AI crawlers dominate traffic, forcing blocks on entire countries
(arstechnica.com)
What if we start throttling them so we make them waste time? Like, we could throttle consecutive requests, so if anyone is hitting the server aggressively they'd get slowed down.
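A minimal sketch of what that kind of progressive throttling could look like, assuming a per-client sliding window (the window size, free-request count, and delay step are all made-up parameters for illustration):

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 10.0   # assumed sliding-window size
FREE_REQUESTS = 5       # requests allowed per window before throttling kicks in
DELAY_STEP = 0.5        # extra seconds of stall per request over the limit

# client id -> timestamps of that client's recent requests
_history = defaultdict(deque)

def throttle_delay(client_id, now=None):
    """Return how many seconds to stall this request.

    The delay grows linearly with how far over the per-window
    limit the client is, so aggressive clients waste more time.
    """
    now = time.monotonic() if now is None else now
    q = _history[client_id]
    q.append(now)
    # Drop timestamps that have fallen out of the window.
    while q and now - q[0] > WINDOW_SECONDS:
        q.popleft()
    over = len(q) - FREE_REQUESTS
    return max(0, over) * DELAY_STEP
```

A server would call `time.sleep(throttle_delay(client_id))` before answering; polite clients never hit the limit, while a client hammering the endpoint sees each successive response get slower.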
They can just interleave requests to different hosts. Honestly, someone spidering the whole Web probably should be doing that regardless.
The tricky bit is recognizing that the requests are all from the same source. They often come from many different IP addresses, and to classify requests at all you have to keep extra state around that you wouldn't need without this anti-social behavior.
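One cheap form of that extra state is to aggregate by network block instead of by exact address, so a crawler rotating through addresses in the same range still shows up as one hot bucket. A hedged sketch, assuming IPv4 and a /24 grouping (both arbitrary choices for illustration):

```python
import ipaddress
from collections import Counter

# One counter per /24 network -- this Counter is exactly the kind of
# extra bookkeeping you wouldn't need if clients behaved.
counts = Counter()

def record(ip_str):
    """Bucket a request by its /24 network and return the bucket's count."""
    # strict=False lets us pass a host address and get its containing network.
    net = ipaddress.ip_network(f"{ip_str}/24", strict=False)
    counts[net] += 1
    return counts[net]
```

Two addresses in the same /24 land in the same bucket, so a threshold on the bucket count catches IP-rotating crawlers that per-address limits miss. It's still easy to evade (rotate across many unrelated ranges), which is why heavier fingerprinting or tarpits like the one below come up.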
https://zadzmo.org/code/nepenthes/