AI company Anthropic to shell out $1.5 billion to authors in a lawsuit settlement over allegations of using unauthorized books to train their AI chatbot models.

Anthropic, a leading artificial intelligence (AI) company, has agreed to pay a staggering $1.5 billion to settle a class-action lawsuit by book authors over the illegal downloading of books to train its chatbot, Claude. This settlement, believed to be the largest copyright recovery ever, marks a significant milestone in the ongoing legal battles between AI companies and creative professionals over copyright infringement.

The Authors Guild, led by Mary Rasenberger, represented the authors in the lawsuit against Anthropic in the United States. The settlement covers approximately 500,000 books allegedly downloaded illegally, with Anthropic agreeing to destroy the original book files as part of the agreement.

Documents disclosed in court showed internal concerns among Anthropic employees about the legality of their use of pirate sites. The company downloaded more than 7 million digitized books, including over 5 million copies from the pirate website Library Genesis and at least 2 million copies from the Pirate Library Mirror.

The settlement is the first of its kind in the AI era and could influence other disputes, including an ongoing lawsuit by authors and newspapers against OpenAI and its business partner Microsoft, and cases against Meta and Midjourney.

Anthropic, a privately held AI company, put its value at $183 billion after raising another $13 billion in investments. Despite this, the company has never reported making a profit, expecting to make $5 billion in sales this year.

However, the company later shifted its approach and began buying books in bulk, tearing off the bindings and scanning each page before feeding the digitized versions into its AI model. This change in strategy was not enough to avoid the lawsuit, and if Anthropic had not settled, losing the case after a scheduled December trial could have cost the company multiple billions of dollars.

The settlement could be a turning point in legal battles between AI companies and creative professionals over copyright infringement. Thomas Heldrup, the head of content protection and enforcement at the Danish Rights Alliance, criticized the tech industry's approach of growing a business and later paying a fine for rule-breaking.

In a related development, another group of authors sued Apple on Friday in the same San Francisco federal court. The Danish Rights Alliance, which previously successfully fought to take down one of the shadow libraries, expressed concerns about the settlement's impact on European writers and publishers.

One lead plaintiff in the case, Andrea Bartz, author of the debut thriller novel "The Lost Night", was among those found in the dataset. The settlement could set a precedent for future cases involving AI companies and copyright infringement.

In a surprising twist, a federal judge found that training AI chatbots on copyrighted books wasn't illegal, but Anthropic wrongfully acquired millions of books through pirate websites. Judge Alsup's June ruling found that the use of copyrighted works qualified as "fair use" under U.S. copyright law, but the acquisition of the books was not.

The settlement is a clear indication that AI companies must tread carefully when it comes to copyrighted material. As books continue to be important sources of data needed to build AI language models behind chatbots like Anthropic's Claude and OpenAI's ChatGPT, this case serves as a reminder of the importance of respecting intellectual property rights.

AI company Anthropic to shell out $1.5 billion to authors in a lawsuit settlement over allegations of using unauthorized books to train their AI chatbot models.