Like many AI companies, Udio and Suno relied on large-scale theft to create their generative AI models. This they have all but admitted, even before the music industry's new lawsuits against them have gone before a judge. If the case goes before a jury, the trial could be both a damaging exposé and a highly useful precedent for similarly unethical AI companies facing near-certain legal peril.
The lawsuits were filed Monday with great fanfare by the Recording Industry Association of America, putting us all in the uncomfortable position of rooting for the RIAA, which for decades has been the bogeyman of digital media. I have personally received nastygrams from them! The case is simply that clear.
The gist of the two lawsuits, which are extremely similar in content, is that Suno and Udio (strictly speaking, Uncharted Labs doing business as Udio) indiscriminately pillaged more or less the entire history of recorded music to form datasets, which they then used to train a music-generating AI.
And here let us quickly note that these AIs don't so much "generate" as match the user's prompt to patterns from their training data and then attempt to complete that pattern. In a way, all these models do is perform covers or mashups of the songs they ingested.
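To make that concrete, here is a toy sketch in Python of what "pattern completion" means. This little bigram model is vastly simpler than whatever neural networks Suno and Udio actually run (their architectures are not public), and its corpus, names, and prompt are invented for illustration; but it shows the core property at issue: a system built this way can only emit recombinations of what it ingested.

# Toy illustration of "generation" as pattern completion. This is NOT
# Suno's or Udio's actual system (those aren't public); a bigram model
# over lyrics stands in for a far larger neural network over audio.
import random
from collections import defaultdict

training_corpus = [
    "you shake my nerves and you rattle my brain",
    "goodness gracious great balls of fire",
]

# Record which word followed each word in the training data.
followers = defaultdict(list)
for line in training_corpus:
    words = line.split()
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)

def complete(prompt, max_words=8):
    """Extend the prompt by sampling continuations seen in training."""
    out = prompt.split()
    for _ in range(max_words):
        options = followers.get(out[-1])
        if not options:
            break  # the model knows nothing it didn't ingest
        out.append(random.choice(options))
    return " ".join(out)

# A prompt overlapping the training data gets the original back more or
# less verbatim, the same regurgitation the RIAA's exhibits demonstrate.
print(complete("great balls"))  # -> "great balls of fire"

The point of the toy is not the mechanism but the consequence: nothing comes out that did not first go in, which is why side-by-side comparisons with the originals are so damning.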
That Suno and Udio did ingest said data is, for all intents and purposes (including legal ones), unquestionably the case. The companies' leadership and investors have been unwisely loose-lipped about the copyright challenges of the space.
They have admitted that the only way to create a good music-generation model is to ingest a large amount of high-quality music, much of which will be copyrighted. It is very simply a necessary step for creating machine learning models of this kind.
Then they admitted that they did so without the copyright owners' permission. Investor Brian Hiatt told Rolling Stone just a few months ago:
Honestly, if we had deals with labels when this company got started, I probably wouldn't have invested in it. I think that they needed to make this product without the constraints.
Tell me you stole a century of music without telling me you stole a century of music, got it. To be clear, by “constraints,” he’s referring to copyright law.
Last, the companies told the RIAA's lawyers that they believe swiping all this media falls under fair use doctrine, a defense that fundamentally only comes into play in the unauthorized use of a work. Now, fair use is admittedly a complex and hazy concept, in theory and in execution. But a company with $100 million in its pockets stealing every song ever made so it can replicate them in bulk and sell the results: I'm not a lawyer, but that does seem to stray somewhat outside the intended safe harbor of, say, a seventh-grader using a Pearl Jam song in the background of their video on global warming.
To be blunt, it looks like these companies' goose is cooked. They clearly hoped they could take a page from OpenAI's playbook: secretly use copyrighted works, then deploy evasive language and misdirection to stall their less deep-pocketed critics, like authors and journalists. If, by the time the AI companies' skulduggery is revealed, they are the only option for distribution, it no longer matters.
In other words: deny, deflect, delay. Ideally you can spin it out until the tables turn and you make deals with your critics; for LLMs that means news outlets and the like, and in this case it would be record labels, which the music generators clearly hoped to eventually approach from a position of power. "Sure, we stole your stuff, but now it's a big business; wouldn't you rather play with us than against us?" It's a common strategy in Silicon Valley, and a winning one, because it mainly just costs money.
But it's harder to pull off when there's a smoking gun in your hand. And unfortunately for Udio and Suno, the RIAA included a few thousand smoking guns in the lawsuit: songs it owns that are clearly being regurgitated by the music models. Jackson 5 or Maroon 5, the "generated" songs are just lightly garbled versions of the originals, something that would be impossible if the originals weren't included in the training data.
The nature of LLMs, specifically their tendency to hallucinate and lose the plot the longer they write, precludes regurgitation of, for example, entire books. This has likely mooted a lawsuit by authors against OpenAI, since the latter can plausibly claim that the snippets its model does quote were grabbed from reviews, first pages available online and so on. (The latest goalpost move is that they did use copyrighted works early on but have since stopped, which is funny because it's like saying you only juiced the orange once but have since stopped.)
What you can't do is plausibly claim that your music generator only heard a few bars of "Great Balls of Fire" and somehow managed to spit out the rest word for word and chord for chord. Any judge or jury would laugh in your face, and with luck a court artist will get their chance at illustrating it.
This isn't just intuitively obvious, it's legally consequential as well, since it's clear the models are re-creating entire works (poorly at times, to be sure, but complete songs). This lets the RIAA claim that Udio and Suno are doing real and major harm to the business of the copyright holders and artists being regurgitated, which in turn lets it ask the judge to shut down the AI companies' whole operation at the outset of the trial with an injunction.
Opening paragraphs of your book coming out of an LLM? That's an intellectual issue to be discussed at length. A dollar-store "Call Me Maybe" generated on demand? Shut it down. I'm not saying it's right, but it's likely.
The predictable response from the companies has been that the system is not intended to reproduce copyrighted works: a desperate, naked attempt to offload liability onto users under Section 230 safe harbor. That is, the same way Instagram isn't liable if you use a copyrighted song to back your Reel. Here, the argument seems unlikely to gain traction, partly because of the aforementioned admissions that the companies themselves ignored copyright from the start.
What will be the outcome of these lawsuits? As with all things AI, it's quite impossible to say ahead of time, since there is little in the way of precedent or applicable, settled doctrine.
My prediction, again lacking any real expertise here, is that the companies will be forced to reveal their training data and methods, those things being of clear evidentiary interest. Seeing them, and their obvious misuse of copyrighted material, along with (in all likelihood) communications indicating knowledge that they were breaking the law, will probably precipitate an attempt to settle or avoid trial, and/or a speedy judgment against Udio and Suno. They will also be forced to stop any operations that rely on the theft-based models. At least one of the two will attempt to continue business using legal (or at least legally adjacent) sources of music, but the resulting model will be a huge step down in quality, and users will flee.
Investors? Ideally, they'll lose their shirts, having placed their bets on something that was obviously and provably illegal and unethical, and not just in the eyes of nebbish author associations but according to the legal minds at the infamously and ruthlessly litigious RIAA. Whether the damages amount to the cash on hand or promised funding is anyone's guess.
The consequences may be far-reaching: If investors in a hot new generative media startup suddenly see a hundred million dollars vaporized due to the fundamental nature of generative media, a different level of diligence will suddenly seem appropriate. Companies will learn from the trial (if there is one) or settlement documents and so on what could have been said, or perhaps more importantly, what should not have been said, to avoid liability and keep copyright holders guessing.
Though this particular suit seems almost a foregone conclusion, not every AI company leaves its fingerprints around the crime scene quite so liberally. It won't be a playbook for prosecuting or squeezing settlements out of other generative AI companies, but rather an object lesson in hubris. It's good to have one of those every once in a while, even if the teacher happens to be the RIAA.