this post was submitted on 26 Jul 2023

[–] corstian@lemmy.world 32 points 2 years ago (6 children)

Am I the only one thinking these trust tokens are not going to prevent bots from scraping websites?

Eventually, somewhere, someone will just develop the infrastructure to work their way around this, right?

[–] diyrebel@lemmy.dbzer0.com 10 points 2 years ago* (last edited 2 years ago) (5 children)

It would stop beneficial bots like the ones I create① as a small-time hobbyist, because the little guy does not have the resources for this arms race. You may be right when it comes to large-scale scraping operations run by businesses (e.g. scraping Ryanair or Southwest Airlines so an airfare consolidation site can show more fares).

① e.g. I wrote a bot that scraped real estate sites and public transport sites, then found me the house with the shortest public transport commute.
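
For anyone curious, the selection step of a bot like that could look roughly like the sketch below. It assumes the scraping has already produced a list of listings and a table of commute times; the `Listing` type, the `shortest_commute` helper, the addresses, and the numbers are all made up for illustration and are not the actual bot.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    # Hypothetical shape of one scraped real estate listing.
    address: str
    price: int


def shortest_commute(listings, commute_minutes, max_price=None):
    """Return the listing with the smallest commute time, or None.

    commute_minutes maps an address to the public transport commute
    in minutes (assumed to come from a separate scrape).
    """
    candidates = [
        item
        for item in listings
        # Skip listings with no known commute or above the price cap.
        if item.address in commute_minutes
        and (max_price is None or item.price <= max_price)
    ]
    return min(
        candidates,
        key=lambda item: commute_minutes[item.address],
        default=None,
    )


# Made-up sample data standing in for the scraped results.
listings = [
    Listing("1 Canal St", 310_000),
    Listing("22 Station Rd", 295_000),
    Listing("5 Hill Ln", 280_000),
]
commute_minutes = {"1 Canal St": 42, "22 Station Rd": 18, "5 Hill Ln": 35}

best = shortest_commute(listings, commute_minutes)
print(best.address)  # prints "22 Station Rd"
```

The hard part is the scraping itself, which is exactly what these tokens would get in the way of; the ranking at the end is trivial.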

[–] heimchen@discuss.tchncs.de 1 points 2 years ago (1 children)

How would this impact Selenium-based bots?

[–] diyrebel@lemmy.dbzer0.com 1 points 2 years ago

Perhaps not at all.

But the limitations of using #Selenium are big ones: being forced to work in Java, being forced to run a resource-hogging modern GUI browser, being forced to reveal more browser-fingerprint info, being browser-dependent, etc. Selenium is my last choice, used only when desperation is sufficiently high.
