“Meta has treated the so-called 'public availability' of the shadow dataset as a get-out-of-jail-free card, despite internal Meta records showing every relevant decision-maker at Meta, even its CEO Mark Zuckerberg, also knew Libgen was 'a dataset that we know is pirated,'” the plaintiffs allege in the motion. (Originally filed in late 2024, the motion is a request to file a third amended complaint.)
Apart from the plaintiffs' brief, another filing was unredacted in response to Chhabria's order: Meta's opposition to the motion to file an amended complaint. It argues that the authors' efforts to add additional claims to the case are an “eleventh-hour gamble based on a false and inflammatory premise” and denies that Meta had waited to reveal important information in discovery. Instead, Meta argues that it first told the plaintiffs that it used the Libgen dataset in July 2024. (Since much of the discovery material remains confidential, it is difficult for WIRED to confirm that claim.)
Meta's argument hinges on its claim that the plaintiffs were already aware of its use of Libgen and should not be given additional time to file a third amended complaint when they had enough time to do so before discovery ended in December 2024. “Plaintiffs knew about Meta's downloading and use of Libgen and other alleged 'shadow libraries' since at least mid-July 2024,” lawyers for the tech giant argue.
In November 2023, Chhabria granted Meta's motion to dismiss certain claims of the lawsuit, including the claim that Meta's alleged use of authors' work to train AI violated the Digital Millennium Copyright Act, an American law introduced in 1998 to prevent people from selling or duplicating copyrighted works on the Internet. The judge at that time agreed with Meta's stance that the plaintiffs had not provided sufficient evidence to prove that the company removed “copyright management information” (CMI) such as author names and titles of works.
The unredacted documents argue that the plaintiffs should be allowed to amend their complaint, alleging that the information disclosed by Meta is evidence that the DMCA claim was justified. They also say the discovery process has revealed reasons to add new claims. The motion alleges, “Meta, through a corporate representative who testified on November 20, 2024, has now admitted under oath to uploading (a.k.a. 'seeding') pirated files containing Plaintiffs' works to 'torrent' sites.” (Seeding occurs when torrented files are shared with other peers after they have been downloaded.)
“This torrenting activity turned Meta into a distributor of the same pirated copyrighted content that it was also downloading for use in its commercially available AI models,” claims one of the newly unredacted documents, alleging, in other words, that Meta not only used copyrighted material but also disseminated it without permission.
Libgen, a collection of books uploaded to the Internet that originated in Russia around 2008, is one of the largest and most controversial “shadow libraries” in the world. In 2015, a New York judge ordered a preliminary injunction against the site, a measure theoretically designed to temporarily shut down the archive, but its unidentified administrators simply changed its domain. In September 2024, a different New York judge ordered Libgen to pay $30 million to rights holders for violating their copyrights, even though it is not known who actually operates the piracy hub.
Meta's discovery problems in this case are not over, either. In the same order, Chhabria warned the tech giant against any overly broad redaction requests in the future: “If Meta again submits an unreasonably broad sealing request, all the content will simply be unsealed,” he wrote.