So judges are saying:
If you trained a model on a single copyrighted work, then that would be a copyright violation because it would inevitably produce output similar to that single work.
But if you trained it on hundreds of thousands of copyrighted works, that would no longer be a copyright violation, because the output wouldn’t closely match any single work.
How is something a crime if you do it once, but not if you do it a million times?
It reminds me of the scheme from Office Space: https://youtu.be/yZjCQ3T5yXo