November 17, 2025

Reddit Sues Perplexity (A.I. System) Over Alleged Data Scraping

Reddit filed a federal lawsuit in New York on 22 October 2025, accusing Perplexity AI and three data-scraping companies — Lithuania-based Oxylabs, Russia-linked AWMProxy, and Texas-based SerpApi — of unlawfully mining Reddit’s user-generated content for commercial use in training artificial-intelligence systems.

According to the complaint, the scraping firms circumvented Reddit’s protective measures by using stealth methods (such as scraping Reddit content from Google search results) to harvest millions of posts. Perplexity is accused of purchasing this scraped content rather than entering into a formal licensing deal with Reddit, even though Reddit has existing agreements with major AI firms such as Google and OpenAI.

The complaint claims that after Reddit issued a cease-and-desist letter in May 2024, Perplexity nonetheless increased the volume of Reddit citations in its “answer engine” forty-fold. Reddit likens the scraping companies to bank robbers: unable to access the vault (Reddit’s API), they intercept the armored truck (Google search results).

Perplexity and the scraping firms have either denied wrongdoing or not yet responded publicly in detail. Those that have responded say they respect open access to public knowledge and will contest the allegations. Meanwhile, Reddit seeks monetary damages and an injunction to stop further unauthorized use of its content.

This legal action comes amid a broader wave of disputes between content providers and AI-training firms over data usage, consent, licensing and copyright. Earlier in 2025, Reddit sued another AI company, Anthropic, over similar scraping claims.


Key Points

  • Reddit is suing Perplexity AI and three data-scraping firms for allegedly acquiring Reddit’s content without permission and using it to train AI models.
  • The lawsuit alleges stealth scraping of Reddit content via search-engine results rather than through direct API access, and points to a dramatic increase in Reddit-sourced content after a cease-and-desist letter.
  • Reddit has existing licensing agreements with other AI firms but contends that Perplexity did not enter into such an arrangement.
  • The case reflects a broader clash over who controls user-generated data, and how it is paid for, in the age of large language models.
  • The outcome could affect business models of AI companies, licensing practices of content platforms, and regulatory frameworks around data-use and consent.

Projections & What It Means for the Future

Data ownership and licensing shift: If Reddit succeeds, we may see a surge in licensing negotiations between user-content platforms and AI firms, changing how training data is acquired and monetized.

Precedent for AI training standards: The case may determine legal boundaries for scraping or using publicly visible content for commercial AI training — potentially raising compliance costs for AI startups.

Business model disruption: AI firms that rely on broad, unlicensed scraping may face increased legal risk and may need to adjust their data-acquisition strategies. Reddit’s action may push others into formal agreements rather than relying on aggregated “public data.”

Regulatory and legislative ripple effects: Governments and regulators might respond by clarifying or tightening laws around data scraping, fair use, platform liability and user content rights.

Platform economics and user value: Platforms like Reddit may gain stronger leverage in how their user-generated content is used commercially — possibly altering revenue sharing, API access fees and platform strategy.

Innovation vs ethical use tension: The tension between enabling the advance of AI systems and protecting content creators and platforms will grow sharper. Legal outcomes may tilt toward greater restraint on AI-training methods or push engineering innovation toward more transparent, licensed data pipelines.

In sum: This lawsuit is more than a platform-vs-AI dispute; it may define who holds the rights to vast troves of human-written content, how that content is monetized or leveraged, and what the legal guardrails for AI training will be.


References

  • “Reddit sues Perplexity, others over alleged data scraping” — Reuters, 22 Oct 2025.
  • Associated Press: “Reddit sues AI company over ‘industrial-scale’ scraping of user comments.”
  • Supplementary reporting via The Verge and Bloomberg on the case and context.