this post was submitted on 14 Jul 2024
TechTakes
It blocks at least Wget and Curl, but works for other unusual UA strings like "Hello".
As of 2023 this was because of a default AWS firewall rule: https://www.lesswrong.com/posts/gidrFxE5hdQWCrXxn/why-is-lesswrong-blocking-wget-and-curl-scrape?commentId=jzyz4sZ82bw2MgZNW
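If anyone wants to check the User-Agent behaviour themselves, here's a rough sketch using only Python's standard library. The helper names (`build_request`, `probe`) and the example UA strings are my own placeholders, not anything from the linked post, and the result may differ if the firewall rule has changed since 2023.

```python
import urllib.error
import urllib.request


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build (but do not send) a GET request with an explicit User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def probe(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Send one request and return the HTTP status code the server gives back."""
    req = build_request(url, user_agent)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # A block usually surfaces as a 4xx response rather than a dropped connection.
        return err.code


# Example usage (makes real network requests, so left commented out).
# The UA strings are illustrative; use whatever your own tools actually send.
# for ua in ("Wget/1.21", "curl/8.0", "Hello"):
#     print(ua, probe("https://www.lesswrong.com/", ua))
```

One request per UA string is enough to see the difference, so there's no need to loop this.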
Speaking more generally, Wget's recursive crawl can cause problems if run with inadequate rate limiting. For example, here's what Wikipedia's robots.txt says: