Stephen King: My Books Were Used to Train AI
(www.theatlantic.com)
By that definition of copying, Google is infringing on millions of copyrights through its search engine, and anyone viewing a copyrighted work online is also making an unauthorized copy. These companies are using data from public sources that others have access to. They are doing no more copying than a normal user viewing a webpage.
I don't think so. Your comparisons aren't really relevant. If Google inadvertently scrapes a page containing copyrighted material and serves it to a user, there are mechanisms to get that content taken down, and Google faces a lawsuit if it isn't. Try posting a movie on YouTube: if a copyright holder notifies Google, that content will be taken down.
Training an LLM is different: that material was used to help build the model and is now a part of that product. That creates a legal liability.