this post was submitted on 18 Aug 2025
Technology
The internet came together to define the robots file standard; it could just as easily come up with a standard API for database dumps. But we've decided on war ever since the 2023 API wars, and now we're going to see all the small websites die while Facebook gets even more powerful.
Well, there you have it. Although I still feel weird that it's somehow "the internet" that's supposed to solve a problem that's fully caused by AI companies and their web crawlers.
If a crawler keeps spamming and breaking a site, I see it as nothing short of a DoS attack.
Not to mention that robots.txt is completely voluntary and, as far as I know, mostly ignored by these companies. So then what makes you think that any of them are acting in good faith?
To me that is the core issue and why your position feels so outlandish. It's like having a bully at school who constantly takes your lunch, and your solution being: "Just bring them a lunch as well, maybe they'll stop."
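To make the "voluntary" part concrete, here's a rough sketch using Python's standard-library urllib.robotparser. The robots.txt content, the bot name ExampleAIBot, and the helper is_allowed are all made up for illustration. The point is that the check runs on the crawler's side: nothing stops a crawler from simply never calling it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the rules above permit user_agent to fetch url.

    This is a check a *polite* crawler chooses to run before fetching.
    The server has no way to enforce it through robots.txt alone.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A well-behaved crawler asks first and backs off when told no.
    print(is_allowed("ExampleAIBot", "https://example.com/page"))
    # Other agents fall under the wildcard rules.
    print(is_allowed("SomeBrowser", "https://example.com/page"))
    print(is_allowed("SomeBrowser", "https://example.com/private/x"))
```

A crawler that skips is_allowed entirely just sends its requests anyway, which is why the file only works against bots that already intend to behave.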
The solution is breaking intellectual property and making sharing public data easy and efficient. A top-down imposition DESIGNED to crush the giants back down to the level playing field of the small players, into a system where cooperation empowers the small and places the burdens on the big, with the understanding that all public data is "our" data and nobody, including its custodian, should get between US and IT. Something designed by actually competent and clever politicians who will anticipate and counter all the dirty tricks big tech would try to regain the upper hand. I want big tech permanently losing on a field designed to disadvantage anything that accumulates power.