what about data scrapers? (programming.dev)

Reddit and Twitter (supposedly) jacked up their API prices because of data scrapers. What could Lemmy do to try to stop them?

I don't think we can do anything.

ruffsl@programming.dev 7 points 1 year ago

Yeah, I actually think Lemmy could benefit from improved scraping and indexing. For example, it'd really help if more search engines could natively understand Lemmy's federated nature, e.g.:

  • Deduplicate links by prioritizing results for instances hosting the community that a post was originally submitted to.
  • Include and denote cross posts by recognizing order of submission timestamp and prioritizing popularity via vote ratios, comment counts, and lurker click-through traffic.
  • Do the same deduplication and prioritization across instances, but for comments as well.
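The first bullet could look roughly like the sketch below, written from an indexer's point of view. It assumes the indexer already knows, for each result, an identifier shared by every federated copy of a post and the host of the instance that owns the community. The `Result` type and its `post_id` and `community_host` fields are hypothetical names for illustration, not Lemmy's actual API.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Result:
    url: str             # where this copy of the post was found
    post_id: str         # hypothetical: identifier shared by all federated copies
    community_host: str  # hypothetical: host of the instance that owns the community


def is_home_copy(result: Result) -> bool:
    """True if this copy is served by the community's home instance."""
    return urlparse(result.url).hostname == result.community_host


def deduplicate(results: list[Result]) -> list[Result]:
    """Keep one result per post, preferring the home-instance copy."""
    best: dict[str, Result] = {}
    for result in results:
        current = best.get(result.post_id)
        # Take the first copy seen, and swap it out only for a home copy.
        if current is None or (is_home_copy(result) and not is_home_copy(current)):
            best[result.post_id] = result
    return list(best.values())
```

If no copy from the home instance was crawled, the first copy seen is kept, so a post is never dropped from the results entirely. The same grouping would apply to comments if they carry a comparable shared identifier.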

Another use case besides search engines would be internet archive projects, which help preserve historic internet content even when Lemmy instances fall offline and disappear. For example, much knowledge was lost to us due to the Twitter API apocalypse and the Reddit Blackout.

Most of the above will only ever be possible with improved scraping or even federation APIs.

this post was submitted on 08 Jul 2023
9 points (84.6% liked)

Programming.dev Meta


Welcome to the Programming.Dev meta community!

This is a community for discussing things about programming.dev itself. Things like announcements, site help posts, site questions, etc. are all welcome here.
