The rules of AI training have just changed. By default, Cloudflare will now block AI scrapers across the millions of websites it protects (roughly 24 per cent of all sites on the internet). Any AI company seeking to crawl a Cloudflare-protected site will have to obtain explicit permission from the content owner. This is the first infrastructure-level defense of its kind.
The company also launched Pay Per Crawl, a monetization tool that allows publishers to charge AI firms for data access. This includes the ability to set terms and prices for bot traffic. Early adopters include major publishers like Gannett, Time, and Stack Overflow.
The new service targets large language model builders like OpenAI, Google, Meta, and Anthropic, many of which have been accused of scraping copyrighted content without consent. Google’s current crawl-to-traffic ratio, the number of pages it crawls for each visitor it sends back, is 18:1. OpenAI’s is 1,500:1. Publishers have begun to push back with lawsuits and licensing deals.
Cloudflare’s chief strategy officer, Stephanie Cohen, framed the new policy as a way to restore balance to the internet economy. The company reports that more than one million domains had already opted in to blocking AI crawlers before the default block was enacted.
This changes the economics of AI training. Free data is no longer guaranteed. If you run a site behind Cloudflare, your content is now protected by default. If you build AI tools, the bill for training data just went up.
As always, your thoughts and comments are both welcome and encouraged. -s
About Shelly Palmer
Shelly Palmer is the Professor of Advanced Media in Residence at Syracuse University’s S.I. Newhouse School of Public Communications and CEO of The Palmer Group, a consulting practice that helps Fortune 500 companies with technology, media and marketing. He covers tech and business and is a regular commentator on CNN.