Using lemmyverse user-generated content to train AI
(sh.itjust.works)
Home of the sh.itjust.works instance.
They have to pay Reddit now that the API is gone. I'm quite certain that at least one of the companies scraping the web to train its LLM has been using it.
And I'm quite certain that this happens to the fediverse as well. You don't even need an API: just set up your own instance, make a few thousand accounts, and use them to subscribe to communities all over. Then you've got all the data in a nice DB.