Encyclopedia Britannica and Merriam-Webster Sue OpenAI Over AI Training
The publishers of Encyclopedia Britannica and Merriam-Webster have escalated the legal battle against generative AI, filing a lawsuit Friday that accuses OpenAI of illicitly harvesting copyrighted material. The complaint alleges that GPT-4 has effectively memorized proprietary content, allowing it to output near-verbatim text that directly cannibalizes the publishers' web traffic.

The legal filing highlights specific instances where OpenAI’s models reproduced entire passages from Britannica’s archives, undermining the publishers' role as an original source. Rather than functioning as a traditional search tool that directs traffic to external websites, the lawsuit argues, ChatGPT acts as a direct substitute that strips away the utility of the publishers' platforms. According to the plaintiffs, this unauthorized ingestion of data constitutes systemic infringement of their intellectual property rights, because the models were built using their content without license or compensation.

This litigation is part of a broader industry pushback against AI firms over the ethics of data scraping. The New York Times is currently engaged in a similar legal challenge against OpenAI, while other sectors continue to test the boundaries of fair use. Anthropic recently set a significant benchmark in this landscape by settling a class-action lawsuit with authors for $1.5 billion, signaling that the cost of training large language models on protected works is becoming a central liability for the industry.



