Date: 2025-07-03

Data for AI systems has become an increasingly contentious issue. OpenAI, Anthropic, Google and other companies building AI systems have amassed reams of information from across the internet to train their AI models. High-quality data is particularly prized because it helps AI models become more proficient in generating accurate answers, videos and images.

But website publishers, authors, news organisations and other content creators have accused AI companies of using their material without permission and payment. Last month, Reddit sued Anthropic, saying the startup had unlawfully used the data of its more than 100 million daily users to train its AI systems. In 2023, The New York Times sued OpenAI and its partner, Microsoft, accusing them of copyright infringement of news content related to AI systems. OpenAI and Microsoft have denied those claims.

Some publishers have struck licensing deals with AI companies to receive compensation for their content. In May, the Times agreed to license its editorial content to Amazon for use in the tech giant’s AI platforms. Axel Springer, Condé Nast and News Corp. have also entered into agreements with AI companies to receive revenue for the use of their material.

Mark Howard, the chief operating officer of Time, said he welcomed Cloudflare’s move. Data scraping by AI companies threatens anyone who creates content, he said, adding that news publishers like Time deserved fair compensation for what they published.

Still, what Cloudflare is enabling “is really just the very, very first step in what will be a very long process,” he said. “But you have to start somewhere, and you have to start at some time.”

OpenAI, Anthropic and Google did not respond to requests for comment.

Cloudflare began considering how to help online publishers about 18 months ago, Prince said. For the past few decades, getting people to go to their websites was how publishers and content creators made money, he said. But AI has changed those dynamics, with people increasingly turning to AI tools like ChatGPT instead of a search engine or a primary source article.

Prince said he was “deeply concerned that the incentives for content creation are dead”. Last July, Cloudflare introduced an optional setting to allow website publishers to block AI scrapers if they wanted. That led to the default setting Tuesday.

AI companies that do not pay for content will ultimately lose out on access to it, Prince said.

“I am 100% confident we can block them from accessing the content,” he said. “And if they don’t get access to the content, then their product will be worse.”

This article originally appeared in The New York Times.

Written by: Natallie Rocha

Photographs by: Jason Henry

©2025 THE NEW YORK TIMES