Hiya, folks, and welcome to TechCrunch’s regular AI newsletter.
This week in AI, music labels accused two startups developing AI-powered song generators, Udio and Suno, of copyright infringement.
The RIAA, the trade organization representing the music recording industry in the U.S., announced lawsuits against the companies on Monday, brought by Sony Music Entertainment, Universal Music Group, Warner Records and others. The suits claim that Udio and Suno trained the generative AI models underpinning their platforms on labels’ music without compensating those labels, and request $150,000 in compensation per allegedly infringed work.
“Synthetic musical outputs could saturate the market with machine-generated content that can directly compete with, cheapen and ultimately drown out the real sound recordings on which the service is built,” the labels say in their complaints.
The suits add to the growing body of litigation against generative AI vendors, including against big guns like OpenAI, arguing much the same thing: that companies training on copyrighted works must pay rightsholders or at the very least credit them, and allow them to opt out of training if they wish. Vendors have long claimed fair use protections, asserting that the copyrighted data they train on is public and that their models create transformative, not plagiaristic, works.
So how will the courts rule? That, dear reader, is the billion-dollar question, and one that’ll take ages to sort out.
You’d think it’d be a slam dunk for copyright holders, what with the mounting evidence that generative AI models can regurgitate nearly (emphasis on nearly) verbatim the copyrighted art, books, songs and so forth they’re trained on. But there’s an outcome in which generative AI vendors get off scot-free, and owe Google their success for setting the consequential precedent.
Over a decade ago, Google began scanning tens of millions of books to construct an archive for Google Books, a kind of search engine for literary content. Authors and publishers sued Google over the practice, claiming that reproducing their IP online amounted to infringement. But they lost. On appeal, a court held that Google Books’ copying had a “highly convincing transformative purpose.”
The courts might decide that generative AI has a “highly convincing transformative purpose,” too, if the plaintiffs fail to show that vendors’ models do indeed plagiarize at scale. Or, as The Atlantic’s Alex Reisner suggests, there may not be a single ruling on whether generative AI tech as a whole infringes. Judges could well determine winners model by model, case by case, taking each generated output into consideration.
My colleague Devin Coldewey put it succinctly in a piece this week: “Not every AI company leaves its fingerprints across the crime scene quite so liberally.” As the litigation plays out, we can be sure that AI vendors whose business models depend on the outcomes are taking detailed notes.
News
Advanced Voice Mode delayed: OpenAI has delayed advanced Voice Mode, the eerily realistic, nearly real-time conversational experience for its AI-powered chatbot platform ChatGPT. But there aren’t any idle hands at OpenAI, which also this week acqui-hired remote collaboration startup Multi and released a macOS client for all ChatGPT users.
Stability lands a lifeline: On the financial precipice, Stability AI, the maker of open image-generating model Stable Diffusion, was saved by a group of investors that included Napster founder Sean Parker and ex-Google CEO Eric Schmidt. Its debts forgiven, the company also appointed a new CEO, former Weta Digital head Prem Akkaraju, as part of a wide-ranging effort to regain its footing in the ultra-competitive AI landscape.
Gemini comes to Gmail: Google is rolling out a new Gemini-powered AI side panel in Gmail that can help you write emails and summarize threads. The same side panel is making its way to the rest of the search giant’s productivity apps suite: Docs, Sheets, Slides and Drive.
Smashing good curator: Goodreads’ co-founder Otis Chandler has launched Smashing, an AI- and community-powered content recommendation app with the goal of helping connect users to their interests by surfacing the web’s hidden gems. Smashing offers summaries of stories, key excerpts and interesting pull quotes, automatically identifying topics and threads of interest to individual users and encouraging users to like, save and comment on articles.
Apple says no to Meta’s AI: Days after The Wall Street Journal reported that Apple and Meta were in talks to integrate the latter’s AI models, Bloomberg’s Mark Gurman said that the iPhone maker wasn’t planning any such move. Apple shelved the idea of putting Meta’s AI on iPhones over privacy concerns, Bloomberg said, as well as the optics of partnering with a social network whose privacy policies it’s often criticized.
Research paper of the week
Beware the Russian-influenced chatbots. They could be right under your nose.
Earlier this month, Axios highlighted a study from NewsGuard, the misinformation-countering organization, that found that the leading AI chatbots are regurgitating snippets from Russian propaganda campaigns.
NewsGuard entered several dozen prompts into 10 leading chatbots, including OpenAI’s ChatGPT, Anthropic’s Claude and Google’s Gemini, asking about narratives known to have been created by Russian propagandists, specifically American fugitive John Mark Dougan. According to the company, the chatbots responded with disinformation 32% of the time, presenting false Russian-written reports as fact.
The study illustrates the increased scrutiny on AI vendors as election season in the U.S. nears. Microsoft, OpenAI, Google and a number of other leading AI companies agreed at the Munich Security Conference in February to take action to curb the spread of deepfakes and election-related misinformation. But platform abuse remains rampant.
“This report really demonstrates in specifics why the industry has to give special attention to news and information,” NewsGuard co-CEO Steven Brill told Axios. “For now, don’t trust answers provided by most of these chatbots to issues related to news, especially controversial issues.”
Model of the week
Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) claim to have developed a model, DenseAV, that can learn language by predicting what it sees from what it hears, and vice versa.
The researchers, led by Mark Hamilton, an MIT PhD student in electrical engineering and computer science, were inspired to create DenseAV by the nonverbal ways animals communicate. “We thought, maybe we need to use audio and video to learn language,” he told MIT CSAIL’s press office. “Is there a way we could let an algorithm watch TV all day and from this figure out what we’re talking about?”
DenseAV processes only two types of data, audio and visual, and does so separately, “learning” by comparing pairs of audio and visual signals to find which signals match and which don’t. Trained on a dataset of two million YouTube videos, DenseAV can identify objects from their names and sounds by looking for, then aggregating, all of the possible matches between an audio clip and an image’s pixels.
When DenseAV listens to a dog barking, for instance, one part of the model homes in on language, like the word “dog,” while another part focuses on the barking sounds. The researchers say this shows DenseAV can not only learn the meaning of words and the locations of sounds but also learn to distinguish between these “cross-modal” connections.
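For the curious, here’s a rough Python sketch of the matching idea described above: score an audio clip against an image by comparing every audio frame with every pixel location, then aggregate the results. It’s an illustration only, not DenseAV’s actual code. The toy feature vectors, the function names and the max-then-average aggregation are all assumptions made for the example.

```python
# Hypothetical sketch: the names and the max/mean aggregation are assumptions
# for illustration, not taken from DenseAV's implementation.

def dot(u, v):
    """Dot product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def clip_image_similarity(audio_frames, pixels):
    """Score how well an audio clip matches an image.

    audio_frames: list of feature vectors, one per audio time step.
    pixels: list of feature vectors, one per image location.
    Each audio frame keeps its best-matching pixel; the clip-level
    score is the average of those best matches.
    """
    best_per_frame = [max(dot(a, p) for p in pixels) for a in audio_frames]
    return sum(best_per_frame) / len(best_per_frame)


def best_match(audio_frames, candidate_images):
    """Return the index of the candidate image that best matches the clip."""
    scores = [clip_image_similarity(audio_frames, img) for img in candidate_images]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy data: 2-D features where the first axis loosely means "dog-like".
bark_clip = [[1.0, 0.0], [0.9, 0.1]]
dog_image = [[1.0, 0.0], [0.0, 1.0]]
cat_image = [[0.0, 1.0], [0.1, 0.9]]

print(best_match(bark_clip, [dog_image, cat_image]))
```

In a training setup along these lines, scores for clips and images from the same video would be pushed up and scores for mismatched pairs pushed down, which is one way to read “comparing pairs of audio and visual signals to find which signals match and which don’t.”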
Looking ahead, the team aims to create systems that can learn from massive amounts of video- or audio-only data, and to scale up their work with larger models, possibly integrated with knowledge from language-understanding models to improve performance.
Grab bag
Nobody can accuse OpenAI CTO Mira Murati of not being consistently candid.
Speaking during a fireside at Dartmouth’s School of Engineering, Murati admitted that, yes, generative AI will eliminate some creative jobs, but suggested that those jobs “maybe shouldn’t have been there in the first place.”
“I definitely anticipate that quite a lot of jobs will change, some jobs might be lost, some jobs might be gained,” she continued. “The reality is that we don’t really understand the impact that AI is going to have on jobs yet.”
Creatives didn’t take kindly to Murati’s remarks, and no wonder. Setting aside the apathetic phrasing, OpenAI, like the aforementioned Udio and Suno, faces litigation, critics and regulators alleging that it’s cashing in on the works of artists without compensating them.
OpenAI recently promised to release tools to allow creators greater control over how their works are used in its products, and it continues to ink licensing deals with copyright holders and publishers. But the company isn’t exactly lobbying for universal basic income, or spearheading any meaningful effort to reskill or upskill the workforces its tech is impacting.
A recent piece in The Wall Street Journal found that contract jobs requiring basic writing, coding and translation are disappearing. And a study published last November shows that, following the launch of OpenAI’s ChatGPT, freelancers got fewer jobs and earned much less.
OpenAI’s stated mission, at least until it becomes a for-profit company, is to ensure that artificial general intelligence (AGI), which it describes as “AI systems that are generally smarter than humans,” “benefits all of humanity.” It hasn’t achieved AGI. But wouldn’t it be laudable if OpenAI, true to the “benefiting all of humanity” part, set aside even a small fraction of its revenue ($3.4 billion+) for payments to creators so that they aren’t dragged down in the generative AI flood?
I can dream, can’t I?