Training AI with Pirated Books Despite Warnings

AI Development Controversy

Meta Platforms (NASDAQ: META), the parent company of Facebook and Instagram, is currently embroiled in a significant legal battle. Recent court filings reveal that the tech giant proceeded with the use of thousands of copyrighted books to train its artificial intelligence models, even after receiving legal warnings against such actions. This development has placed Meta at the centre of consolidated lawsuits from high-profile authors, including comedian Sarah Silverman and Pulitzer Prize winner Michael Chabon.

Chat Logs Unveiled: Ignoring Legal Advice

Key evidence has emerged in the form of chat logs suggesting that Meta was aware of potential legal issues with using these books. The logs feature discussions between Meta researcher Tim Dettmers and the company's legal department about the legality of using book files as training data for AI models. The logs, part of a Discord server conversation, highlight the company's internal debate over the legal ramifications of its actions.

Impact on the Tech Industry: A Wave of Lawsuits

This revelation comes at a time when tech companies are facing a barrage of lawsuits from content creators. These creators accuse the companies of using copyrighted works without permission to build generative AI models, which have become a worldwide sensation and triggered an investment frenzy. If these lawsuits succeed, they could significantly increase the cost of developing data-intensive models by forcing AI companies to compensate artists, authors, and other content creators.

Meta's AI Models

Meta publicly released the first version of its large language model, Llama, in February 2023, disclosing a list of datasets used for training, including "the Books3 section of ThePile." The dataset reportedly contains 196,640 books. However, Meta did not reveal the training data for its latest model, Llama 2, which is available for commercial use.

Regulatory Changes and Future Risks

These developments coincide with new provisional rules in Europe that regulate artificial intelligence. These rules could compel companies to disclose the data used in training their models, potentially increasing legal risks. Llama 2's release, free for companies with fewer than 700 million monthly active users, could disrupt the market for generative AI software, challenging the dominance of companies like OpenAI and Google (NASDAQ: GOOGL) that charge for their models' usage.

Conclusion: A Turning Point for AI and Copyright Law

Meta's situation represents a pivotal moment in the intersection of AI development and copyright law. The outcomes of these legal battles could reshape the landscape of AI technology and its relationship with intellectual property rights.

