Perplexity AI Lawsuit: Reddit Accuses Perplexity of Scraping Its Data via Google
Reddit Sues Perplexity AI Over Data Scraping and Policy Violations
Updated October 24, 2023, 12:19:42 PM PST
What Happened?
Reddit filed a lawsuit against Perplexity AI on September 19, 2023, alleging the AI company engaged in “large-scale misappropriation” of Reddit data and circumvented Reddit’s technological safeguards. The complaint, filed in the U.S. District Court for the Northern District of California, accuses Perplexity of using data scraped from Reddit to train its AI models without a licensing agreement. The Verge first reported the news.
Reddit argues that Perplexity bypassed Reddit’s robots.txt file and other measures designed to prevent automated data collection. Specifically, Perplexity allegedly accessed Reddit content through Google Search Engine Results Pages (SERPs), a method Reddit considers a circumvention of its control measures. This practice, Reddit claims, “damaged” the platform by undermining its ability to control data access and usage and to enforce its privacy policies and user agreements.
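For readers unfamiliar with robots.txt, the sketch below shows roughly how a well-behaved crawler consults such a file before fetching a page. It is an illustration only: the rules, the bot name ExampleAIBot, the example.com URLs, and the helper is_allowed are all hypothetical and are not taken from Reddit’s actual robots.txt or from the complaint.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration; NOT Reddit's actual robots.txt.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The named bot is barred from the whole site.
    print(is_allowed(SAMPLE_ROBOTS_TXT, "ExampleAIBot", "https://example.com/r/python/"))
    # Other bots may fetch anything outside /private/.
    print(is_allowed(SAMPLE_ROBOTS_TXT, "OtherBot", "https://example.com/r/python/"))
```

A robots.txt file is a request, not an access control: nothing in the check above stops a crawler that simply skips it, which is why platforms typically pair the file with other anti-scraping measures.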
Why Is Reddit Concerned?
Reddit’s concerns extend beyond the immediate data scraping. The company fears that if Perplexity’s workaround becomes widespread, it could jeopardize existing licensing deals with other AI companies. Reddit has invested “significant resources” in anti-scraping technology, and unauthorized data access undermines these efforts. The lawsuit details potential damages including lost profits, harm to Reddit’s reputation, and erosion of user trust.
The lawsuit specifically targets Perplexity’s method of accessing data through Google SERPs. Reddit wants the court to issue an injunction preventing companies from scraping Reddit content via this route. It also seeks to block the sale of Reddit data and the development of technologies designed to circumvent Reddit’s data protection measures.
Potential Outcomes and Damages
If Reddit prevails, the court could issue an injunction preventing Perplexity, and potentially other companies, from scraping Reddit data. Reddit could also be awarded considerable damages or be allowed to recover profits Perplexity earned from using the scraped data. The lawsuit doesn’t specify a dollar amount for potential damages.
The outcome of this case could have significant implications for the AI industry and the future of data access. It raises questions about the rights of platform owners to control their data and the extent to which AI companies can rely on publicly available information for training their models.
The Broader Context: AI and Data Scraping
This lawsuit is part of a larger trend of content creators and platforms pushing back against the use of their data to train AI models. Many AI companies rely on vast datasets scraped from the internet, raising concerns about copyright infringement, privacy violations, and fair use. Other companies, like Twitter (now X), have also taken steps to restrict data scraping. TechCrunch reported on Twitter’s similar actions in July 2023.
The legal landscape surrounding data scraping for AI training is still evolving. There is ongoing debate about whether scraping publicly available data constitutes fair use, and courts are beginning to grapple with these issues. This case will likely contribute to the development of legal precedents in this area.
