- cross-posted to:
- technology@lemmy.world
Copyright class actions could financially ruin AI industry, trade groups say.
AI industry groups are urging an appeals court to block what they say is the largest copyright class action ever certified. They warn that a single lawsuit brought by three authors over Anthropic's AI training now threatens to "financially ruin" the entire AI industry if up to 7 million claimants end up joining the litigation and forcing a settlement.
Last week, Anthropic petitioned to appeal the class certification, urging the court to weigh questions that the district court judge, William Alsup, seemingly did not. Alsup allegedly failed to conduct a “rigorous analysis” of the potential class and instead based his judgment on his “50 years” of experience, Anthropic said.
Read the Order, which is Exhibit B to Anthropic's appellate brief.
Anthropic admitted that they pirated millions of books, as Meta did, to create a massive central library for training AI that they permanently retained, and they now assert that holding them responsible for this theft of IP will destroy the entire AI industry. In other words, this appears to be common practice in the AI industry, done to avoid the prohibitive cost of paying for the works they copy. Given that Meta, one of the wealthiest companies in the world, did the exact same thing, it reinforces the understanding that piracy to avoid paying for their libraries is a central component of training AI.
While the lower court did rule that training an LLM on copyrighted material was a fair use, it expressly did not rule that the derivative works produced are protected by fair use, and it preserved that issue for further litigation:
Emphasis added. In other words, Anthropic can still face liability if its trained AI produces knockoff works.
Finally, the Court held
Emphasis in original.
So to summarize, Anthropic apparently followed the industry standard of piracy to build a massive book library to train its LLMs. Plaintiffs did not dispute that training an LLM on a copyrighted work is fair use, but they did not have sufficient information to assert that knockoff works were produced by the trained LLMs, and the Court preserved that issue for later litigation if the plaintiffs sought to bring such a claim. Finally, the Court noted that Anthropic built its database for training its LLMs through massive straight-up piracy. I think my original comment was a fair assessment.