The one exception to this is UMG v. Anthropic, because, at least initially, older versions of Anthropic's model generated song lyrics as output. That is a problem. The current status of that case is that Anthropic has put safeguards in place to prevent this from happening, and the parties have agreed that, pending the resolution of the case, those safeguards are sufficient, so the plaintiffs are no longer seeking a preliminary injunction.
At the end of the day, the tough question for AI companies isn't "Is it legal to engage in training?" It's "What do you do when your AI produces output that is similar to a particular existing work?"
Do you expect most of these cases to go to trial, or do you see settlements on the horizon?
There may also be some compromises. Where I expect to see compromise is with the big players, who either have large amounts of content or content that is particularly valuable. The New York Times could end up with an agreement and a licensing deal, perhaps one where OpenAI pays money to use the New York Times' content.
There is so much money at stake that perhaps we will get at least some decisions that will set the standards. The class-action plaintiffs, I think, have stars in their eyes. There are a lot of class actions, and my guess is that the defendants will oppose them and hope to win on summary judgment. It is not clear whether they will go to trial. The Supreme Court's decision in Google v. Oracle pushed fair use law very strongly toward resolving these questions on summary judgment rather than in front of a jury. I think the AI companies are going to work very hard to get those cases decided on summary judgment.
Why would it be better for them to win on summary judgment than a jury verdict?
It is faster and cheaper than a trial. And AI companies are worried that juries may not be as sympathetic to them as many people assume: jurors may simply think, "Oh, you copied someone's work; that should be illegal," and not dig into the details of the fair use doctrine.
Many deals have been signed between AI companies and media outlets, content providers, and other rights holders. Most of the time, these deals seem to be more about search than about foundation models, or at least that's how they've been described to me. In your opinion, is licensing the content used in AI search engines, where answers are derived by retrieval-augmented generation, or RAG, legally mandated? Why are they doing it this way?
If you are doing retrieval-augmented generation on targeted, specific content, your fair use argument becomes more challenging. It is much more likely that an AI-powered search will reproduce text taken directly from a particular source in its output, and much less likely that doing so is fair use. It can be, but the riskier aspect is that it is more likely to compete with the original source material. If, instead of directing people to the New York Times story, I give them my AI output that uses RAG to pull text directly from that story, it looks like a replacement for the New York Times, and that can cause harm. The legal risk is higher for the AI company.
What do you want people to know about generative AI copyright fights that they may not already know about, or may have been misinformed about?
The thing I often hear that is wrong as a technical matter is the idea that these are just plagiarism machines, that they're just taking my stuff and spitting it back out in the form of text and responses. I've heard a lot of artists say that, and I've heard a lot of ordinary people say that, and it's not true as a technical matter. You can decide whether generative AI is good or bad. You can decide whether it is legal or illegal. But this really is a fundamentally new thing we have not experienced before. The fact that it requires training on a body of material to learn how sentences work, how arguments work, and various facts about the world does not mean that it is just copying and pasting things or creating a collage. It is really producing things that no one could have expected or predicted, and it is giving us a lot of new content. I think that's important and valuable.