this post was submitted on 19 Aug 2025
864 points (99.3% liked)

(page 5) 43 comments
[–] Glitchvid@lemmy.world 254 points 1 week ago (41 children)

When a firm outright admits to bypassing, or trying to bypass, measures taken to keep them out, you'd think that would be a slam-dunk case of unauthorized access under the CFAA, with felony enhancements.

[–] floquant@lemmy.dbzer0.com 245 points 1 week ago (4 children)

It's difficult to be a shittier company than OpenAI, but Perplexity seems to be trying hard.

[–] JeeBaiChow@lemmy.world 94 points 1 week ago
[–] sylver_dragon@lemmy.world 51 points 1 week ago (3 children)

You'd think that a competent technology company, with their own AI would be able to figure out a way to spoof Cloudflare's checks. I'd still think that.

[–] spankmonkey@lemmy.world 69 points 1 week ago* (last edited 1 week ago) (3 children)

Or find a more efficient way to manage data, since their current approach is basically DDOSing the internet for training data and also for responding to user interactions.

[–] Quill7513@slrpnk.net 32 points 1 week ago

see, but they're not competent. further, they don't care. most of these ai companies are snake oil. they're selling you a solution that doesn't meaningfully solve a problem. their main way of surviving is saying "this is what it can do now, just imagine what it can do if you invest money in my company."

they're scammers, the lot of them, running ponzi schemes with our money. if the planet dies for it, that's no concern of theirs. ponzi schemes require the schemer to have no long term plan, just a line of credit that they can keep drawing from until they skip town before the tax collector comes

[–] lemmyng@piefed.ca 21 points 1 week ago

Perplexity: "But that would cost us moneeyyyy!"

[–] Ekybio@lemmy.world 20 points 1 week ago (3 children)

Can someone with more knowledge shine a bit more light on this whole situation? I'm out of the loop on the technical details.

[–] spankmonkey@lemmy.world 56 points 1 week ago

AI crawlers tend to overwhelm websites by doing the least efficient scraping of data possible, basically DDOSing a huge portion of the internet. Perplexity already scraped the net for training data and is now hammering it inefficiently for searches.

Cloudflare is just trying to keep the bots from overwhelming everything.

[–] panda_abyss@lemmy.ca 33 points 1 week ago* (last edited 1 week ago) (4 children)

Cloudflare runs as a CDN/cache/gateway service in front of a ton of websites. Their service is to help protect against DDOS and malicious traffic.

A few weeks ago cloudflare announced they were going to block AI crawling (good, in my opinion). However they also added a paid service that these AI crawlers can use, so it actually becomes a revenue source for them.

This is a response to that from Perplexity who run an AI search company. I don’t actually know how their service works, but they were specifically called out in the announcement and Cloudflare accused them of “stealth scraping” and ignoring robots.txt and other things.
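For anyone unfamiliar with robots.txt: it's a plain-text file where a site says which crawlers may fetch which paths, and honoring it is voluntary. Here's a minimal sketch of the check a well-behaved crawler would run before fetching a page, using Python's standard `urllib.robotparser`. The bot name `ExampleAIBot`, the rules, and the URLs are made up for illustration; this is not Perplexity's or Cloudflare's actual code.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. A real crawler would download this
# from https://<site>/robots.txt instead of hard-coding it.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The named bot is disallowed everywhere; other agents only under /private/.
    print(may_fetch("ExampleAIBot", "https://example.com/article"))
    print(may_fetch("SomeBrowser", "https://example.com/article"))
```

Nothing enforces the result: a crawler that sends a different user agent simply gets evaluated against the `*` rules, which is roughly why sites end up relying on something like Cloudflare to block at the network level.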

[–] very_well_lost@lemmy.world 32 points 1 week ago* (last edited 1 week ago) (3 children)

> A few weeks ago cloudflare announced they were going to block AI crawling (good, in my opinion). However they also added a paid service that these AI crawlers can use, so it actually becomes a revenue source for them.

I think it's also worth pointing out that all of the big AI companies are currently burning through cash at an absolutely astonishing rate, and none of them are anywhere close to being profitable. So pay-walling the data they use is probably gonna be pretty painful for their already-tortured bottom line (good).

[–] RogueBanana@piefed.zip 4 points 1 week ago

But the website owner can still choose to continue blocking them, right? Without using additional stuff like Anubis, that is.

[–] BetaDoggo_@lemmy.world 22 points 1 week ago* (last edited 1 week ago) (11 children)

Perplexity (an "AI search engine" company with 500 million in funding) can't bypass cloudflare's anti-bot checks. For each search Perplexity scrapes the top results and summarizes them for the user. Cloudflare intentionally blocks perplexity's scrapers because they ignore robots.txt and mimic real users to get around cloudflare's blocking features. Perplexity argues that their scraping is acceptable because it's user initiated.

Personally I think cloudflare is in the right here. The scraped sites get 0 revenue from Perplexity searches (unless the user decides to go through the sources section and click the links) and Perplexity's scraping is unnecessarily traffic intensive since they don't cache the scraped data.
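To illustrate the caching point: if the same page really is re-fetched for every search, even a short-lived cache would cut repeat traffic to the origin. Below is a rough sketch of a time-to-live cache wrapped around a fetch function. `CachedFetcher`, `fake_fetch`, and the 300-second TTL are hypothetical names and values picked for the example; I have no knowledge of how Perplexity's pipeline is actually built.

```python
import time
from typing import Callable, Dict, Tuple


class CachedFetcher:
    """Wrap a fetch function so repeat requests within ttl_seconds reuse the stored body."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        ttl_seconds: float = 300.0,  # assumed value, not a recommendation
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        now = self._clock()
        hit = self._cache.get(url)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: no request reaches the origin site
        body = self._fetch(url)
        self._cache[url] = (now, body)
        return body


if __name__ == "__main__":
    calls = []

    def fake_fetch(url: str) -> str:
        # Stand-in for a real HTTP request; records each origin hit.
        calls.append(url)
        return f"<html>{url}</html>"

    fetcher = CachedFetcher(fake_fetch)
    fetcher.get("https://example.com/a")
    fetcher.get("https://example.com/a")
    print(len(calls))  # number of times the origin was actually contacted
```

The tradeoff is freshness: a longer TTL means fewer hits on the origin but staler summaries, so a search product would have to pick that number per site or per content type.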

[–] Ermiar@lemmy.world 19 points 1 week ago* (last edited 1 week ago) (1 children)
[–] BaroqueInMind@piefed.social 19 points 1 week ago

Cry more, Perplexity.

[–] interdimensionalmeme@lemmy.ml 9 points 1 week ago (1 children)
[–] _cryptagion@lemmy.dbzer0.com 0 points 6 days ago (5 children)

The anti-AI shield and bot-fight mode are free, you don't need to pay anything to use them.

[–] dzajew@piefed.social 7 points 1 week ago

Cry me a river

[–] xxce2AAb@feddit.dk -2 points 1 week ago

Ooh, that's tough, sweetheart. If the owners of those servers want you to visit, they'll just choose another WAF than CF's.

All zero of them.
