Reddit Files Federal Lawsuit Against Perplexity AI and Three Data Scrapers


Reddit has filed a federal lawsuit against AI search startup Perplexity AI and three data-scraping firms, alleging an "industrial-scale, unlawful operation" to harvest user content for AI training. The complaint, filed in a US District Court in New York, accuses the defendants of violating copyright laws and circumventing digital protections. Jesse Dwyer, Head of Communications at Perplexity, responded sarcastically on social media: "Be careful not to cite this Reddit post explaining our response to a lawsuit, or else @Reddit might try to sue you, too."

The lawsuit claims Perplexity is a "willing customer" in a "data laundering economy," working with Lithuania-based Oxylabs UAB, Texas-based SerpApi, and AWMProxy to bypass Reddit's anti-scraping technology. Reddit alleges these firms circumvented Google's controls to scrape Reddit content directly from search engine results. The company says it confirmed this by planting a hidden "test post" on its platform, which later appeared in Perplexity's generated answers.

Perplexity has publicly stated its intention to fight the allegations, asserting that it does not train AI models on content. In a statement, the company explained, "Whenever anyone asks us about content licensing, we explain that Perplexity, as an application-layer company, does not train AI models on content. Never has. So it is impossible for us to sign a license agreement to do so." Perplexity maintains it only summarizes and cites Reddit discussions, similar to how users share links, and views Reddit's actions as contrary to an open internet.

Reddit's Chief Legal Officer, Ben Lee, highlighted the value of Reddit's "vast archive of human discussion" for AI models, noting the company has secured lucrative licensing deals with major players like Google and OpenAI. Reddit previously sent a cease-and-desist letter to Perplexity in May 2024, but observed an increase in Reddit citations by Perplexity afterward. The lawsuit seeks unspecified monetary damages and a permanent court order to prevent unauthorized use of its content.

The case is expected to shape the legal standards for training AI models on publicly available web data. Perplexity's Jesse Dwyer affirmed the company's commitment to "fight vigorously for users’ rights to freely and fairly access public knowledge," adding that its approach "remains principled and responsible."