June 21, 2024

Meta Used Copyrighted Books for AI Training Despite Knowing the Implications

Meta’s legal turmoil over the use of copyrighted material in training its AI models only seems to be getting worse, with authors now alleging that the company was well aware of the legal implications.

A new filing in the ongoing copyright infringement lawsuit against the tech giant indicates that its own lawyers had warned against using copyrighted material for AI training purposes.

The lawsuit in question was initially filed against Meta this summer, by comedian Sarah Silverman, Pulitzer Prize-winning author Michael Chabon, and other authors. The new development further strengthens their claims against the company behind Facebook and Instagram.

The latest complaint in the lawsuit, which was filed on Monday, claims that Meta went against its lawyers’ warnings to train its large language model Llama on copyrighted books. The complaint includes chat logs from a Discord server in which Tim Dettmers, a Meta-affiliated researcher, took part.

The logs show Dettmers describing a back-and-forth with the tech giant’s legal team over whether he was legally allowed to use the book files to train the AI model.

“At Facebook, there are a lot of people interested in working with (T)he (P)ile, including myself, but in its current form, we are unable to use it for legal reasons,” Dettmers wrote.

According to the complaint, the dataset that Dettmers was referring to had been used by Meta to train the first version of Llama.

The chat logs also reveal that a few months before this conversation, the researcher wrote that the company’s lawyers had specifically told him “the data” could not be used. They added that AI models built using that data could not be published.

Though Dettmers didn’t mention what exactly the legal team was worried about, other researchers in the chat described “books with active copyrights” as the biggest concern.

They also went on to say that the data should be usable for AI training under the Fair Use doctrine, which permits unlicensed uses of copyrighted works for certain purposes.

To put it simply, the chat logs indicate that Meta was apparently aware of the legal uncertainties regarding the use of copyrighted books. If accepted as evidence, the new filing could put the company at a significant disadvantage.

It’s worth noting that the new complaint comes after District Judge Vince Chhabria asked the plaintiffs to amend their complaints last month. Dismissing the authors’ allegations that the text generated by Llama was in violation of their copyrights, the judge expressed his doubt regarding the idea of Llama generating text output that resembles their works.

However, Meta has yet to challenge the authors’ core claim, namely that the tech giant violated their copyrights by training Llama on their books.

Chhabria allowed the authors to proceed with the lawsuit by amending most of their claims but said that he’d dismiss them once again if the amended claims failed to show how Llama’s output was similar to the authors’ works.

This is only one of the many lawsuits filed against tech companies like Meta this year, accusing them of infringing copyrights to create popular AI models.

If successful, these lawsuits could slow down the AI craze, since compensation for artists, authors, and other content creators would add to the cost of building AI models.
