Please don’t do this; keep information easy to google. The best part of Reddit was how many hours it saved when googling for information.
There are plenty of instances that copy the original content. As an instance owner who runs only a single project-specific community, I should be able to decide what content is available on my domain and what isn't. Don't you think?
Aside from the questionable content, there are also legal issues around it that I'd rather not deal with.
Yes, it's your choice. I would prefer that this be done sparingly, though, to increase the likelihood of information being indexed and easily found through Google searches.
If you do this, I'd recommend excluding at least your most common communities. Google searching Reddit has been a great tool over the years, and it has improved the discoverability of the service as a whole, especially for smaller communities.
Feels kind of like shooting yourself in the foot. Maybe just exclude NSFW communities (though, do those even exist here?)
I agree, you do you, but IMO if you want to host a Lemmy instance (that's not private), this is kind of part of the deal. If you host communities, you are opening yourself up to this.
There is no way to exclude individual communities. The post URLs are generic, like /post/1234. From nginx or other proxies, I cannot tell what community they belong to. I would love to have my own be searchable, but not at the price of tainting my project's reputation.
Would it be a better idea to exclude any URLs that match a pattern like /c/*@*.*?
I think that would block external communities but keep local ones still indexable in their native locations.
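To make the /c/*@*.* idea concrete, here is a rough sketch in Python of how such a pattern could be tested against request paths. The function name and the exact regular expression are my own assumptions for illustration, not anything taken from Lemmy or nginx; it only assumes that remote communities are served under /c/name@host.tld.

```python
import re

# Assumed shape of a remote community path: /c/<name>@<host>.<tld>,
# optionally followed by further path segments. This is an illustration,
# not a pattern taken from Lemmy itself.
REMOTE_COMMUNITY = re.compile(r"^/c/[^/@]+@[^/@]+\.[^/@]+(?:/.*)?$")


def is_remote_community_path(path: str) -> bool:
    """Return True if the path looks like a federated (remote) community."""
    return REMOTE_COMMUNITY.match(path) is not None


if __name__ == "__main__":
    for path in ("/c/lemmy@lemmy.ml", "/c/lemmy", "/post/1234"):
        print(path, is_remote_community_path(path))
```

Note that this only covers community listing pages. Generic post URLs such as /post/1234 carry no community information, so they would not be caught by a path rule like this.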
Or maybe the Lemmy source code should include a canonical tag pointing to the post on its original host?
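For reference, a canonical tag is just a link element in the page head that tells search engines which URL is the authoritative copy. A minimal Python sketch of building one is below; the helper name is hypothetical and this is not how Lemmy renders its pages, it only assumes the instance knows the post's URL on its home instance.

```python
from html import escape


def canonical_link_tag(original_url: str) -> str:
    """Build a <link rel="canonical"> element for the given URL.

    original_url is assumed to be the post's address on its home instance.
    """
    # Escape the URL so characters like & or " cannot break the attribute.
    return f'<link rel="canonical" href="{escape(original_url, quote=True)}">'


if __name__ == "__main__":
    print(canonical_link_tag("https://lemmy.ml/post/1234"))
```

With something like this in place, copies of a post on other instances could stay reachable while search engines are pointed at the original.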