Reddit Inc. has launched lawsuits against startup Perplexity AI Inc. and three data-scraping service providers for trawling the corporate’s copyrighted content for use to coach AI models.
Reddit compared the information scraping corporations — SerpApi, Oxylabs and AWMProxy — to “bank robbers,” adding that one among the firms “will apparently do anything to get the Reddit data it desperately must fuel its ‘answer engine’ — that’s, anything aside from enter into an agreement with Reddit directly, as a few of its competitors have done.”
Several AI have already made deals with Reddit, including OpenAI, which signed on the dotted line last 12 months to make use of Reddit’s trove of knowledge to coach its large language models. Though no number was given, it was reported that the deal was price $60 million. On the time, Reddit said it hoped to herald around $200 million from licensing agreements over the subsequent three years, with Google LLC also signing on.
The corporate later launched a lawsuit against Anthropic PBC, claiming it was scraping content on Reddit to coach its Claude family of AI models. That makes this latest lawsuit, filed today within the U.S. District Court for the Southern District of Latest York, one among a handful ongoing.
Data scraping firms are a reasonably latest phenomenon that appeared shortly after the generative AI explosion. In accordance with the Latest York Times, SerpApi is predicated in Texas and serves quite a few corporations. Oxylabs is run out of Lithuania, and AWMProxy is Russian.
“AI corporations are locked in an arms race for quality human content — and that pressure has fueled an industrial-scale ‘data laundering’ economy,” Ben Lee, the chief legal officer at Reddit, told The Times. “Scrapers bypass technological protections to steal data, then sell it to clients hungry for training material.”
In accordance with the lawsuit, Reddit claimed it set a trap for Perplexity by publishing a “test post” on its platform that was visible only to Google’s search engine and inaccessible anywhere else on the web. Inside hours, the content of that hidden post appeared in Perplexity’s search results, Reddit said.
Perplexity has said it hasn’t yet received the lawsuit, but told media it should “fight vigorously for users’ rights to freely and fairly access public knowledge.” It added, “Our approach stays principled and responsible as we offer factual answers with accurate AI, and we won’t tolerate threats against openness and the general public interest.”
Photo: Unsplash
Support our mission to maintain content open and free by engaging with theCUBE community. Join theCUBE’s Alumni Trust Network, where technology leaders connect, share intelligence and create opportunities.
- 15M+ viewers of theCUBE videos, powering conversations across AI, cloud, cybersecurity and more
- 11.4k+ theCUBE alumni — Connect with greater than 11,400 tech and business leaders shaping the longer term through a novel trusted-based network.
About SiliconANGLE Media
Founded by tech visionaries John Furrier and Dave Vellante, SiliconANGLE Media has built a dynamic ecosystem of industry-leading digital media brands that reach 15+ million elite tech professionals. Our latest proprietary theCUBE AI Video Cloud is breaking ground in audience interaction, leveraging theCUBEai.com neural network to assist technology corporations make data-driven decisions and stay on the forefront of industry conversations.