this post was submitted on 13 Mar 2025
104 points (98.1% liked)


A contractor for Immigration and Customs Enforcement (ICE) and many other U.S. government agencies has developed a tool that lets analysts more easily pull a target individual’s publicly available data from a wide array of sites, social networks, apps, and services across the web at once, including Bluesky, OnlyFans, and various Meta platforms, according to a leaked list of the sites obtained by 404 Media. In all, the list names more than 200 sites that the contractor, called ShadowDragon, pulls data from and makes available to its government clients, allowing them to map out a person’s activity, movements, and relationships.

Article archived at https://archive.is/xJcrm

Alternate archive at https://web.archive.org/web/20250312132300/https://www.404media.co/the-200-sites-an-ice-surveillance-contractor-is-monitoring/

List of sites at https://docs.google.com/spreadsheets/d/1VyAaJaWCutyJyMiTXuDH4D_HHefoYxnbGL9l02kyCus/edit?ref=404media.co&gid=0#gid=0

List archived at https://archive.is/k2icM

top 8 comments
[–] lemmylommy@lemmy.world 16 points 1 week ago (2 children)

Tesseract OCR is Open Source Software. How can it be a site that they steal information from?

[–] gAlienLifeform@lemmy.world 4 points 1 week ago

Good question I don't have the answer to. I could speculate that this is all likely being sourced from some sort of marketing material that ShadowDragon put out where they just flatly say they're gathering this information from Tesseract, and in reality they're actually gathering any information they can on users who search for this software and download this software, but like I said I'm speculating.

If you're really interested, I would say you should email the author of this article, reach out to Tesseract's development team, or find a way to get a subpoena against ShadowDragon and/or ICE.

[–] Benjaben@lemmy.world 3 points 1 week ago* (last edited 1 week ago)

I hope you'll update us if you chase this down. I like 404 Media and I want to keep liking them, but only if the reporting is good. Hopefully it's a typical tech journalism mistranslation where they use Tesseract OCR to scrape PDFs and the author just misunderstood, or something like that.

Edit: after looking, I don't have any issues. Looks like just a raw list from whatever source, I don't need 404 Media to try to "curate" that or remove elements that seem irrelevant, they can leave that to us.

[–] TK420@lemmy.world 13 points 1 week ago

I hate this timeline.

[–] optissima@lemmy.ml 5 points 1 week ago (1 children)

Archive.is is not working for me. Is there another site you can archive to?

[–] ray1992xd@feddit.nl 1 points 1 week ago

They are not only pulling data from all the x sites in the list, they are also pulling something else in the meantime.