
The Bartz v. Anthropic Settlement: Understanding America's Largest Copyright Settlement

Dave Hansen (Authors Alliance)

November 10, 2025

Phillip Burton Federal Building, San Francisco

When Anthropic agreed to pay $1.5 billion to settle a copyright lawsuit in August 2025, the deal became the largest copyright settlement in U.S. history. Three authors had sued, but nearly half a million works ended up covered by the class. And as much as a quarter of the money may go to the lawyers.

How did we get here? The answer involves several features of American law that, combined, significantly affect litigation strategy for AI copyright suits in the US. This post attempts to explain the suit and its settlement for those watching from outside the US.

Background: What Led to This Moment

In August 2024, three authors—Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson—filed a lawsuit against Anthropic, the AI company behind the Claude language models. They claimed Anthropic had downloaded millions of copyrighted books from "shadow libraries" like Library Genesis (LibGen) and Pirate Library Mirror (PiLiMi) to train its AI systems.

Court documents showed Anthropic downloaded over 7 million books from these pirate sites, in part to use as training data for its AI development. It also bought and scanned millions more for the same purpose.

Initially, the lawsuit attacked the entire practice of using copyrighted works to train AI. But in June 2025, Judge William Alsup of the Northern District of California split the baby. He ruled that Anthropic's use of legally acquired books for AI training was "quintessentially transformative" and protected as fair use—but simultaneously held that downloading and keeping pirated copies was not fair use, and strongly suggested that he would find such uses to be straight-up infringement. This split decision left Anthropic partially vindicated on the big question of AI training but still facing massive liability for downloading books from LibGen and similar sources.

Then came a procedural decision that really changed the trajectory of the suit. In August 2025, Judge Alsup certified the case as a class action, defining the class to include "all beneficial or legal copyright owners of the exclusive right to reproduce copies of any book" in the LibGen or PiLiMi datasets that met his criteria. As the Authors Alliance observed, nobody had asked for this exact class definition; the judge invented it himself, sweeping in not just authors but publishers, estates, and anyone else with reproduction rights. This certification turned a suit by three individual authors into a mega-lawsuit on behalf of rightsholders in nearly half a million works. With statutory damages potentially reaching $150,000 per work, Anthropic was suddenly staring down the barrel of theoretical liability exceeding $70 billion.

By late August 2025, with trial set for December, the parties blinked. They announced a settlement that would eventually be preliminarily approved in September, covering those 482,460 books that made it through the class definition's filters.

The American Class Action System: A Foreign Concept for Many

The U.S. class action system needs some explaining. Though parallel systems exist in other legal systems around the world, the American class action has some unique characteristics. It allows one or a few plaintiffs to sue on behalf of everyone in a similar situation—here, all copyright owners whose books Anthropic allegedly grabbed from pirate sites.

Here's the part that often surprises international observers: most American class actions aren't really driven by the named plaintiffs. They're driven by specialized law firms that essentially gamble millions of dollars on complex litigation, hoping for a massive payday if they win. These firms front all the costs (expert witnesses, document review, years of lawyer time), betting they'll recover through legal fees. In this case, the plaintiffs' lawyers from Lieff Cabraser Heimann & Bernstein and Susman Godfrey are asking for up to 25% of the settlement fund. That's $375 million. And that's actually on the low end for class actions, where fees can run to 33% or more.

Judge Alsup certified the class to include "all beneficial or legal copyright owners of the exclusive right to reproduce copies of any book" in the pirated datasets, with the proviso that those works must have an ISBN/ASIN and also be registered with the U.S. Copyright Office. Note the breadth—not just authors, but publishers, estates, anyone with reproduction rights. And here's something important: the class isn't limited to Americans. If you're a British author or a German publisher whose registered works ended up in LibGen, you're potentially in the class. Foreign rightsholders are less likely to have U.S. copyright registrations (you don't need them for basic copyright protection, only for statutory damages), but many international works do get registered. Non-U.S. authors and publishers should definitely check the settlement database.

The real power of class actions comes from aggregation. Suing over one pirated book would be a gamble, and the legal fees would likely dwarf any recovery. But bundling potentially millions of works together creates real leverage. In this case, the end result of the certified class meant the aggregation of claims for some 482,460 works. Combine that with another American peculiarity—statutory damages—and you have a recipe for massive settlements.

Unlike most countries, where copyright damages are tied to actual harm or profits, U.S. law provides for predetermined "statutory damages": $750 to $30,000 per work for ordinary infringement, up to $150,000 for willful infringement. The math is outlandish: 482,460 books times $150,000 equals roughly $72 billion. Even at the minimum $750, you're looking at about $362 million.
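For readers who want to check the arithmetic, here is a minimal sketch of the exposure range. It assumes, purely for illustration, that every one of the 482,460 works would receive the same per-work award, which no court is obliged to do; the function name `exposure` is hypothetical.

```python
# Statutory damages exposure for the certified class, using figures from the text.
WORKS_IN_CLASS = 482_460

# Per-work statutory damages tiers described above (U.S. dollars).
MIN_PER_WORK = 750
MAX_ORDINARY_PER_WORK = 30_000
MAX_WILLFUL_PER_WORK = 150_000


def exposure(works: int, per_work: int) -> int:
    """Theoretical total liability, ASSUMING a uniform award for every work."""
    return works * per_work


tiers = [
    ("statutory minimum", MIN_PER_WORK),
    ("ordinary maximum", MAX_ORDINARY_PER_WORK),
    ("willful maximum", MAX_WILLFUL_PER_WORK),
]

for label, per_work in tiers:
    total = exposure(WORKS_IN_CLASS, per_work)
    print(f"{label:>18}: ${total:,}")
```

Even the floor of the range sits in the hundreds of millions of dollars, and the ceiling is in the tens of billions, which is what makes aggregation through a class action so powerful.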

This, I think, answers why Anthropic settled despite winning the fair use argument for LLM training. This is "bet-the-company" litigation. When losing could mean owing more money than your company is worth, you settle.

The Settlement Agreement: What It Does and Doesn't Cover

The Substance

American class action law allows parties to settle in a way that binds the entire class to the negotiated settlement. Obviously, with only a few class members representing a large number of absent ones, this creates some risk of abuse, so the law has some procedural safeguards. Namely, the court must hold a hearing and approve the settlement, ensuring that it is “fair, reasonable, and adequate.”

In the Bartz case, the court has not yet done this. Right now, the settlement has received preliminary approval (allowing the settlement claims process to go forward) but the court is awaiting more information (e.g., about how many people opt out, how many claimants come forward, etc.) before final approval at a fairness hearing, which is set for April 2026.

The settlement's core terms are relatively straightforward, though their implementation is complex:

Monetary compensation: Anthropic will pay $1.5 billion into a settlement fund. The settlement fund must cover:

Attorneys' fees: The plaintiffs' lawyers are seeking 25% of the fund, or $375 million, a staggering sum that illustrates the business model of class action litigation
Administrative costs: Settlement administration by JND will likely cost several million dollars
Service awards: $50,000 per named plaintiff (totaling $150,000 for the three class representatives)
Special Master fees: For resolving disputes between class members
Working group fees: For industry experts advising on allocation

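After these deductions, the remainder will be distributed equally per work among valid claimants. The headline figure of approximately $3,100 per work is simply $1.5 billion divided by 482,460 works, before any deductions; with the 25% fee request alone taken out, it falls to roughly $2,300 per work, though this could increase if not all rightsholders make claims.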

Destruction of materials: Anthropic must destroy the pirated libraries and any derivative copies within 30 days of final judgment, providing written certification to class counsel.

Limited scope: Crucially, this settlement only releases Anthropic from liability for past conduct—specifically, its acquisition, retention, and use of the identified pirated works before August 25, 2025. It does not:

Create any ongoing licensing scheme for future AI training
Cover claims based on AI outputs that might infringe copyrights
Affect Anthropic's ability to train on lawfully acquired materials
Set precedent for how AI training should be regulated globally

This narrow scope distinguishes it from more ambitious settlements like the failed Google Books settlement, which attempted to create a forward-looking licensing framework.
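The monetary terms above reduce to simple arithmetic, sketched below. Dividing the full fund by the number of works gives about $3,100 per work, while the figure net of deductions is lower. The fee fraction and service awards come from the text; the `OTHER_COSTS` figure is a placeholder assumption, since the text says only that administration will cost "several million dollars" and gives no amounts for the Special Master or working group. The function name is hypothetical.

```python
# Rough per-work payout from the settlement fund, before and after deductions.
SETTLEMENT_FUND = 1_500_000_000
WORKS_IN_CLASS = 482_460

FEE_FRACTION = 0.25          # class counsel's request (up to 25% of the fund)
SERVICE_AWARDS = 3 * 50_000  # $50,000 for each of the three named plaintiffs

# ASSUMPTION: placeholder for administration, Special Master, and working
# group costs. The real figures are not stated in the text.
OTHER_COSTS = 5_000_000


def per_work_payout(fund, works, fee_fraction=0.0, service_awards=0, other_costs=0):
    """Fund remaining after deductions, divided equally across works."""
    net_fund = fund * (1 - fee_fraction) - service_awards - other_costs
    return net_fund / works


gross = per_work_payout(SETTLEMENT_FUND, WORKS_IN_CLASS)
net = per_work_payout(
    SETTLEMENT_FUND, WORKS_IN_CLASS, FEE_FRACTION, SERVICE_AWARDS, OTHER_COSTS
)

print(f"gross per work: ${gross:,.0f}")
print(f"net per work (under the assumptions above): ${net:,.0f}")
```

The attorneys' fee request dominates the deductions; the service awards and any plausible administrative costs move the per-work figure by only a few dollars.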

The Process

The settlement process is somewhat complex, as you might imagine when one must identify, notify, and pay hundreds of thousands of potential claimants. The settlement administrator, JND, a private company that specializes in such work, has compiled contact information for 243,397 unique authors (66% of authors) and 15,786 unique publishers (97% of publishers) represented in the works list. This is no small task. Recent court filings show that, through October 31, 2025, class members have made claims for 58,788 (about 12%) of the works included in the settlement. Though this number is still small, it does seem to indicate that word is getting out.

How funds are split is also important. The settlement sets a default allocation between authors and publishers at 50/50, though parties can deviate from this by providing documentation of their publishing contracts. For books with multiple authors, the authors' 50% share is divided among them, while the publisher receives the full 50%. For example, as the settlement documents explain, "in the case of a book published by a university press with three co-authors, all four parties (the university press and the three co-authors) submit valid claims, and all four parties elect to follow the default rules. If the per-work allocation is $3,000, then $1,500 is distributed to the university press and $500 is distributed to each co-author."
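The default rule in that example can be written out as a short sketch. It assumes, as the quoted example does, that co-authors divide the author share equally; the function name `allocate` and its parameters are hypothetical, and the optional `publisher_share` argument stands in for a documented deviation from the 50/50 default.

```python
from fractions import Fraction


def allocate(per_work_amount, n_authors, publisher_share=Fraction(1, 2)):
    """Split one work's payout between the publisher and its co-authors.

    Returns (publisher_amount, amount_per_author). Co-authors are ASSUMED to
    share the author portion equally, as in the settlement's own example.
    """
    if n_authors < 1:
        raise ValueError("a work needs at least one author")
    amount = Fraction(per_work_amount)
    publisher_amount = amount * publisher_share
    amount_per_author = (amount - publisher_amount) / n_authors
    return publisher_amount, amount_per_author


# The university press example: $3,000 per work, three co-authors.
publisher_amount, amount_per_author = allocate(3000, 3)
print(f"publisher: ${float(publisher_amount):,.2f}")
print(f"each co-author: ${float(amount_per_author):,.2f}")
```

Exact fractions are used instead of floating point so that the shares always add back up to the per-work amount.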

Educational publishers have special provisions recognizing the complexity of textbook contracts, allowing them to claim different splits without documentary proof unless disputes arise.

The settlement numbers tell a stark story about who really benefits. Settlement administrators found contact information for 97% of publishers but only 66% of authors. This makes sense: publishers are better organized, have lawyers on retainer, and know how to work the claims process. A quick search of the works list database shows that some of the largest publishers stand to win the most: “Elsevier” returns about 12,000 works, “Wiley” another 20,000, and “Harpercollins” about 19,500. Even subtracting 25% for the lawyers, and splitting the remaining funds 50/50 with authors, I’d expect to see multi-million dollar payouts for those publishers.
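A back-of-the-envelope version of that estimate is sketched below. It rests on loud assumptions: every search hit is treated as a valid, claimed work belonging to that publisher, the default 50/50 split applies to all of them, and only the 25% fee request is deducted. The hit counts are the approximate search results quoted above, and the function name is hypothetical.

```python
# Rough publisher payout estimate from approximate works-list search hits.
SETTLEMENT_FUND = 1_500_000_000
WORKS_IN_CLASS = 482_460
FEE_FRACTION = 0.25    # attorneys' fee request
PUBLISHER_SHARE = 0.5  # default 50/50 split with authors

# Approximate hit counts from a name search of the works list (per the text).
search_hits = {"Elsevier": 12_000, "Wiley": 20_000, "Harpercollins": 19_500}


def estimate_publisher_payout(n_works):
    """Estimated publisher payout, ASSUMING every hit is a valid claimed work."""
    per_work_net = SETTLEMENT_FUND * (1 - FEE_FRACTION) / WORKS_IN_CLASS
    return n_works * per_work_net * PUBLISHER_SHARE


for name, n_works in search_hits.items():
    millions = estimate_publisher_payout(n_works) / 1_000_000
    print(f"{name}: about ${millions:.0f} million")
```

Under these assumptions each of the three publishers lands in the eight-figure range, so "multi-million" is, if anything, conservative.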

International observers should note that this allocation system largely reflects American publishing practices, where contracts vary widely in how they allocate rights and revenues. In countries with stronger statutory protections for authors' rights, such divisions might be predetermined by law rather than left to private negotiation. The settlement also includes provisions for disputes: if co-owners disagree about splits, the matter goes to a Special Master for binding resolution. If any co-owner opts out of the settlement, the entire work is removed from the class.

The settlement is moving toward final approval, with these key dates:

January 7, 2026: Opt-out deadline
March 2, 2026: Re-inclusion deadline (if you opted out but changed your mind)
March 23, 2026: Claims deadline

International rightsholders: check the settlement website now. If your work was registered with the U.S. Copyright Office before August 2022 and ended up in LibGen, you might have money waiting.

How Bartz May Affect Future Litigation and Settlement Strategy

The Bartz settlement has given plaintiffs' lawyers a roadmap. As Professor Ed Lee at chatgptiseatingtheworld.com explains, law firms like Susman Godfrey have developed what he calls "The Shadow Library Strategy," which attempts to skip the debate over whether AI training is fair use (judges so far seem to think it is) and go straight for the piracy angle.

Most new book copyright lawsuits against AI companies (the majority of which are also class actions) now allege that the defendants downloaded from shadow libraries. Whether they will be successful is another question. The Kadrey v. Meta case involved very similar facts: documents reveal that Meta executives, including CEO Mark Zuckerberg, allegedly approved using LibGen despite internal warnings about legal risks. According to the filings, companies felt they had no choice; as one Meta employee explained, "LibGen is essential to meet SOTA [state of the art] numbers" and competitors like OpenAI and Mistral were believed to be using the same sources. Yet, in Kadrey, the court did not follow Judge Alsup’s lead in Bartz, and treated the downloaded copies as a necessary and integral part of Meta’s overall effort to train its AI models (which the court ultimately concluded was fair use).

To the extent that Bartz is the pattern for future litigation, the settlement leaves AI companies in an odd position. Judge Alsup's ruling says training on legally purchased books is fair use, which is a green light. But the massive payout for downloading and storing books from LibGen raises all sorts of other questions. What about, for example, websites scraped online without obvious permission?

For AI companies outside the U.S., this case is a warning about American legal risk. Even if AI training ultimately proves legal, defending a class action can cost millions just to get to trial. And with statutory damages in play, the downside risk is catastrophic. Some companies will likely sign licensing deals they don't strictly need, just to avoid the possibility of a lawsuit.

Conclusion

Nobody really won in this suit. Authors and publishers get money but no control over future AI training. Anthropic writes a massive check, but it already won on fair use for training its LLM and it gets to keep the scans it made of legally purchased books. The AI industry gets some guidance (don't download your books from LibGen) but still faces uncertainty about outputs and future regulation.