this post was submitted on 14 Aug 2025
469 points (98.4% liked)

Technology

[–] aarmea@lemmy.world 8 points 1 day ago (1 children)
[–] trk@aussie.zone 6 points 23 hours ago (1 children)

Is there a more ignored file on the internet than robots.txt?
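
For context, honoring robots.txt is opt-in on the crawler's side: a polite bot checks the file before fetching anything, and nothing enforces that check. A minimal sketch of what it might look like, using Python's standard `urllib.robotparser`. The robots.txt content, the crawler name `ExampleAIBot`, and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one made-up AI crawler entirely,
# and keep everyone else out of /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
```

A well-behaved crawler would call `may_fetch(parser, "ExampleAIBot", url)` before each request and skip the URL when it returns `False`. A crawler that never makes the call is not slowed down at all, which is the complaint here.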

[–] ohshit604@sh.itjust.works 5 points 22 hours ago

Do DNT headers count?

[–] isVeryLoud@lemmy.ca 2 points 23 hours ago* (last edited 23 hours ago)

What does 11ty stand for, besides "eleventy"? I'm expecting something like a11y, where there are 11 characters in between.

Also, someone crash course me about the difference between 11ty and regular SSR?

[–] AnarchistArtificer@slrpnk.net 32 points 1 day ago (2 children)

Something I love about this piece is that it being written by a person who cares deeply about stuff means that I now have a positive opinion towards the two places linked as being good places for recipes ([Meera Sodha](https://www.theguardian.com/profile/meera-sodha) and Smitten Kitchen). I'm going to promptly forget about them, because I'm not the kind of cook who uses recipes, but still, it's striking to me how transferable caring about stuff is. I don't know the author of this blog, but based on this post (and the zippity-fast speed that their website loads), I'm positively inclined towards them, because I am a silly human, and that means I am a deeply social creature.

[–] ruplicant@sh.itjust.works 2 points 1 day ago

and I love you

[–] wildwhitehorses@aussie.zone 2 points 1 day ago (1 children)

Ooh I have an amazing cheese scone recipe from The Guardian

[–] needanke@feddit.org 2 points 1 day ago

You cannot just say that and then not link it!

[–] xthexder@l.sw0.com 30 points 2 days ago (1 children)

This site loaded so quickly it actually surprised me. I swear I've got apps on my phone that can't even switch views faster than this site loads uncached. That's impressive.

[–] GreenShimada@lemmy.world 20 points 1 day ago (1 children)

This is how the internet used to be, for the most part. There's no boatload of 27 JS and 15 CSS files to reference. There's no batch of 110mb splash SVGs to load so I scroll down past 3 words and see 7 stock images before getting 1 sentence of information. It's probably a 200kb site with a few 300kb images to load as well.

This is the work of an enlightened being.

Images might be even less, it looks like a fair amount of the site is inline SVG.

[–] powerofm@lemmy.ca 104 points 2 days ago (1 children)

localghost is an amazing domain name

[–] Ek-Hou-Van-Braai@piefed.social 32 points 2 days ago* (last edited 2 days ago) (1 children)

Was just thinking the same

I wonder what is up with http://localghost.com/ it looks like some treasure hunt

[–] ExcessShiv@lemmy.dbzer0.com 16 points 2 days ago (1 children)

The number series changes with every refresh...I'm intrigued

[–] MalReynolds@piefed.social 18 points 2 days ago

Perhaps a poisoning the well attack on AI scrapers?

[–] moseschrute@lemmy.world 13 points 2 days ago

This website is really pretty. Design goals

[–] GEEXiES@lemmy.world 46 points 2 days ago (3 children)

Message aside, the site is cool, love that you can change the style, and the icon animation on the last one is brilliant. Also: a webring! It's been a long time since I saw one. I need more of this web and I'm happy to rediscover it.

[–] ohshit604@sh.itjust.works 19 points 1 day ago* (last edited 1 day ago) (1 children)
[–] GEEXiES@lemmy.world 3 points 1 day ago

WTF??? That's amazing! Thank you for wasting my time -a lot of it- in the best possible way ;)

[–] mfed1122@discuss.tchncs.de 11 points 1 day ago (1 children)

Def check out neocities more, also melonking.net and his projects are pretty cool, particularly https://melonland.net/surf-club has a lot of good sites on it

[–] GEEXiES@lemmy.world 4 points 1 day ago

surf-club I'm making it a habit to hit that "random" link daily. Already spent quite some time at a few sites. Even when they are no longer being updated all of them are interesting in their own ways and, funnily, refreshing (given the current modern web). Thank you!

[–] Landless2029@lemmy.world 7 points 2 days ago

It's so damn snappy too!

Hosted at neocities. Wait GeoCities?? In no. Blah.

[–] noretus@sopuli.xyz 42 points 2 days ago (2 children)

I love these pages. I miss the early 2000s internet.

[–] afk_strats@lemmy.world 26 points 2 days ago (1 children)

The op site is hosted on Neocities. They aim to foster that 2000s vibe. Check them out here

[–] justlemmyin@lemmy.world 13 points 2 days ago* (last edited 2 days ago)
[–] fluxion@lemmy.world 18 points 2 days ago

This reminds me of when the Internet was new, exciting, and full of promise for improving life for people and being a reliable way to bypass censorship and share the truth with the world.

Thank you for that.

[–] cecilkorik@lemmy.ca 26 points 2 days ago* (last edited 2 days ago)

I read these websites because I'm also a human and I enjoy experiencing the ideas of my fellow humans first-hand, not filtered into a boring puree or boiled down essence. I have always enjoyed reading things written by actual humans, because I can connect intellectually and emotionally with the actual real live person behind the ideas, and learn and grow with them as they also do the same, and I expect that enjoyment will continue if not intensify in the coming years as AI buries such signals in ugly soulless noise.

There will always be an appetite for real human creation. The hard part will be reliably finding it. I will be relying heavily on my finely tuned bullshit detector to work as an AI detector for now, and I can only hope that it will be enough.

[–] CoffeeTails@lemmy.world 2 points 1 day ago

I like this

[–] drmoose@lemmy.world 14 points 2 days ago (5 children)

As someone who's been on the web since the 90s I hate this.

The web was designed to be user agent agnostic. Desktop, phone, fridge, ai agents, curl, python script - whatever agent you are using shouldn't matter for access. That's the whole point of open internet, period.

[–] xeroxguts@lemmy.dbzer0.com 2 points 1 day ago

Lol this is such a bizarre comment. Back then, AI wasn't scraping everything humans made for the profit of a few. It was a non-issue, and therefore you have no standing in claiming that "that was the whole point."

This works as well on my phone as it does on my computer, and loads faster than most modern websites, making it that much more accessible to MORE humans.

The web designer isn't limiting access, they are expanding on it - for humans. The people actually sentient and able to understand their words rather than just copy and recontextualize them.

[–] Allero@lemmy.today 26 points 2 days ago* (last edited 2 days ago) (2 children)

When the Web was first designed, some of the concerns we have today were nonexistent.

I believe in freedom of information, and would love for the information I share to be accessed in any way a given user wants.

But I have to stand defensive and support the author here, too. The modern LLM boom aims to essentially replace original resources with AI-generated summaries step by step. This is detrimental to the Internet, and to knowledge as we know and preserve it.

First, there is an event commonly called Google Zero, which is briefly mentioned in the article. If you don't know what it is, it is the not-so-hypothetical-anymore moment when Google (or, really, any other large player) essentially accumulates all information on the Web, feeds it to AI, and from then on doesn't serve links anymore, going straight to answers based on training data. Users will jump to this - they already do - because it offers convenience. But for any independent creators it means having no audience, no money, and no means to produce new quality content, trapping users in a self-contained loop that loses nuance, timeliness, and truthfulness, and stays under corporate control. This goes beyond cooking recipes and personal notes - it permeates science, political discussion, and much more.

Second, LLMs multiply traffic coming to sites, which becomes an infrastructural problem. Bots access sites at much higher rates than humans do, and when their intent is to scrape your entire website every now and again and there are dozens of them, this becomes huge.

Third, having proprietary models train on the data I provide without any attribution, copyright etc. lets giant corporations profit off my back, while at the same time making it so that fewer genuine users will see what I produce. This means the careers of authors, journalists etc. are dying, and it also means these corporations are left free to abuse each and every one of us without any consent.

Fourth, and I wonder if you see it by now, LLMs and the way they represent data, along with SEO tools meant to drive information through the search bots, begin to shape how we talk. All I say doesn't have to be a list of points, yet it is. It could be less verbose, more readable, yet it is the way it is. Because when we interact with the products of such developments too much, we begin shaping our own language in a way that is less human-readable and more meant for machines, without us often being aware of it. This is a real issue of communication.

So, as much as I hate it, I'm gonna protect a lot of the data I share.

[–] whyNotSquirrel@sh.itjust.works 14 points 2 days ago (9 children)

Open until your server is down because LLMs are overloading it

[–] kescusay@lemmy.world 15 points 2 days ago (1 children)

At my company, we had to implement all sorts of WAF rules precisely for that reason. Those things are fucking aggressive.

[–] bravesilvernest@lemmy.ml 8 points 2 days ago

Same. And just because page size is "low" doesn't mean shit when they're flooding requests. Try having public research data and watch how much your costs go up just due to load balancer throughput.
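
To make the "flooding requests" point concrete, here is a minimal sketch of the kind of per-client check that a WAF rule or reverse proxy might apply. It is not any particular product's implementation. The class name `SlidingWindowLimiter`, the limits, and the client identifiers are all assumptions chosen for illustration.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Timestamps of recently accepted requests, keyed by client.
        self._hits = defaultdict(deque)

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        """Return True if the request is accepted, False if it is throttled."""
        if now is None:
            now = time.monotonic()
        hits = self._hits[client_id]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Something like `SlidingWindowLimiter(max_requests=3, window_seconds=10.0)` would accept three requests from one client inside ten seconds and reject the fourth. The catch is that the check itself runs on every request, so a flood still costs load balancer throughput even when every page is tiny.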

[–] JustARaccoon@lemmy.world 12 points 2 days ago (2 children)

They did have a lot of concerns with abuse, though, and you can see that in the way the cookies debate went before they were supported in their current form. I think AI crawlers tanking websites' bandwidth and misusing the data they scrape would 100% be something the Mozilla from back then would've had concerns over allowing or encouraging.

[–] frezik@lemmy.blahaj.zone 7 points 2 days ago (1 children)

Instructions unclear, built whole site with nested tables.

[–] caseyweederman@lemmy.ca 3 points 1 day ago

Each one had better be in its own iframe.

[–] devilish666@lemmy.world 9 points 2 days ago* (last edited 2 days ago) (1 children)

*weird robot sounds* "aargh... must... ignore... the rule." *sound of crashing robot* "continue scraping websites." *weird robot noises continue* "ignore robots.txt, ignore anti_ai_rules.txt, bypass cloudflare" *robot sounds get weirder and weirder as it gets deeper and deeper into the website*

[–] ZiemekZ@lemmy.world 3 points 2 days ago

haha tarpit goes brrr
