Cloudflare Gives Website Owners Option To Charge OpenAI Bots For Scraping

Cloudflare has launched a private beta feature called Pay per Crawl, whose sole purpose is to let a website owner charge an AI crawler a fixed fee each time the crawler requests a page. The feature addresses a common operational gap: currently, a publisher can either leave all content open to automated collection or block crawlers entirely, and any paid arrangement must be negotiated manually.

At Belitsoft, a software development company that serves media firms and specializes in Cloudflare integrations, we have already explored the legal challenges facing AI companies. See our previous article, OpenAI Data Retention Court Order: Implications for Everybody.

Pay per Crawl replaces the binary choice between open access and outright blocking with an automated, per-request billing step handled entirely at Cloudflare’s edge. The mechanism revives HTTP status code 402, “Payment Required,” which has long been reserved in the HTTP specification but rarely used, and uses it to signal that a charge is due before content is served. Cloudflare functions as the merchant of record, so the publisher does not need to integrate a payment gateway, issue invoices, or reconcile receipts. Cloudflare collects funds from the crawler operator and remits them to the publisher on its normal payout schedule.

Deployment is straightforward. In the Cloudflare dashboard, the publisher sets a single price that applies to every request for the entire domain. Next, the publisher assigns one of three actions to each known crawler: “Allow” for full, free access; “Charge” to deliver content only if the correct payment intent is present; and “Block” to deny all requests. If a crawler marked “Charge” does not yet have a Cloudflare billing relationship, the request still receives a 402 response with no content, and the response headers tell the caller that payment would grant access once a billing relationship is established. All routing decisions run after existing WAF, rate limiting, and bot management policies, so the feature does not interfere with the site’s current security posture.
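The three actions can be pictured as a small routing table. The following Python sketch is illustrative only: the crawler names, the `route` function, the 403 status for “Block,” and the default for unlisted crawlers are all assumptions, not Cloudflare’s implementation.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    CHARGE = "charge"
    BLOCK = "block"


# Hypothetical per-crawler settings; the crawler names are placeholders.
CRAWLER_ACTIONS = {
    "example-research-bot": Action.ALLOW,
    "example-commercial-bot": Action.CHARGE,
    "example-unwanted-bot": Action.BLOCK,
}


def route(crawler: str, has_billing: bool, payment_ok: bool) -> int:
    """Return the HTTP status an edge might send to an identified crawler."""
    # Treating unlisted crawlers as blocked is an assumption of this sketch.
    action = CRAWLER_ACTIONS.get(crawler, Action.BLOCK)
    if action is Action.ALLOW:
        return 200
    if action is Action.BLOCK:
        return 403  # assumed status; the article only says requests are denied
    # Action.CHARGE: content is served only when a billing relationship
    # exists and the payment intent is correct; otherwise 402, no content.
    if has_billing and payment_ok:
        return 200
    return 402
```

Under these assumptions, `route("example-commercial-bot", has_billing=False, payment_ok=False)` yields 402, matching the case where a crawler marked “Charge” has no billing relationship yet.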

For a crawler operator, participation begins with identity proof. The operator generates an Ed25519 key pair, publishes the public key in JSON Web Key (JWK) format at a known URL, and registers that URL along with the crawler’s user-agent string with Cloudflare. Every request is then signed under the emerging Web Bot Auth standard and carries three headers – Signature-Agent, Signature-Input, and Signature – so the edge can confirm that the message came from the declared crawler and has not been spoofed. Unsigned or malformed requests never proceed to the payment check; they are processed or blocked by the publisher’s existing bot rules as usual.
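To make the key-publication step concrete, here is a minimal Python sketch of how a raw Ed25519 public key maps to JWK format (the `OKP` key type from RFC 8037) and how an RFC 7638 thumbprint, which can serve as a key identifier, is derived. The helper names, the placeholder key bytes, and the `{"keys": [...]}` wrapper are assumptions for illustration; generating the key pair and signing requests require a cryptography library and are not shown.

```python
import base64
import hashlib
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWK members require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def ed25519_public_jwk(public_key: bytes) -> dict:
    """Wrap a raw 32-byte Ed25519 public key as a JWK (RFC 8037)."""
    if len(public_key) != 32:
        raise ValueError("Ed25519 public keys are 32 bytes")
    return {"kty": "OKP", "crv": "Ed25519", "x": b64url(public_key)}


def jwk_thumbprint(jwk: dict) -> str:
    """SHA-256 JWK thumbprint (RFC 7638) over the required OKP members."""
    canonical = json.dumps(
        {"crv": jwk["crv"], "kty": jwk["kty"], "x": jwk["x"]},
        separators=(",", ":"),
        sort_keys=True,
    )
    return b64url(hashlib.sha256(canonical.encode("utf-8")).digest())


# Placeholder bytes, NOT a real key; a real crawler would export its public key.
PLACEHOLDER_KEY = bytes(range(32))
jwk = ed25519_public_jwk(PLACEHOLDER_KEY)

# A JWK Set document such as an operator might host at its key URL (assumed layout).
key_directory = json.dumps({"keys": [jwk]}, indent=2)
```

Only the public half of the key pair ever appears in this document; the private key stays with the crawler operator and is used to produce the Signature header.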

Once a crawler is recognized, payment negotiation follows one of two flows. In the reactive flow, the crawler makes a normal request, the edge returns a 402 status that includes a crawler-price header with the exact charge in US dollars, and the crawler repeats the request with a crawler-exact-price header containing that figure. If the header matches the configured fee and the signature is valid, Cloudflare serves the content with a 200 OK response and logs a billable event. In the proactive flow, the crawler states a maximum acceptable price in a crawler-max-price header on its first attempt. If the site’s configured fee is at or below that ceiling, the content is served immediately, the actual charge is echoed in a crawler-charged header, and the event is logged. If the fee is higher than the crawler’s ceiling, the edge returns 402 with the posted price. Only one price declaration header – either exact or maximum – may appear in a single request; if both are present or if the header is absent on a “Charge” path, the edge responds with 402.
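The two flows reduce to a short decision procedure. The Python sketch below models it from the edge’s point of view, assuming the crawler is already verified and marked “Charge.” The function name `negotiate`, the lowercase header keys, and the plain decimal price format are assumptions; the real header value format may differ.

```python
from decimal import Decimal, InvalidOperation


def negotiate(headers: dict, price: Decimal) -> tuple:
    """Return (status, response_headers) for one request on a 'Charge' path."""
    exact = headers.get("crawler-exact-price")
    ceiling = headers.get("crawler-max-price")
    quote = (402, {"crawler-price": str(price)})

    # Exactly one price declaration is allowed: none or both means 402.
    if (exact is None) == (ceiling is None):
        return quote
    try:
        offered = Decimal(exact if exact is not None else ceiling)
    except InvalidOperation:
        return quote  # unparseable price (handling is an assumption)
    if not offered.is_finite():
        return quote

    if exact is not None:
        # Reactive flow: the crawler repeats the quoted price exactly.
        if offered == price:
            return 200, {}
        return quote

    # Proactive flow: serve if the fee is at or below the crawler's ceiling,
    # echoing the actual charge back to the caller.
    if price <= offered:
        return 200, {"crawler-charged": str(price)}
    return quote
```

With a configured fee of `Decimal("0.01")`, a request carrying `crawler-max-price: 0.05` is served and charged 0.01, while one carrying `crawler-max-price: 0.005` gets a 402 that quotes the posted price.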

Accounting is automatic. Each successful paid response is recorded with the authenticated crawler identity and the amount charged. Cloudflare aggregates these entries, debits the crawler operator’s chosen payment method, and credits the publisher. Because Cloudflare is the merchant of record, the publisher sees a single consolidated remittance and does not handle disputes or chargebacks. The workflow is identical whether the site processes a few dozen paid crawls per month or several million.
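The aggregation step amounts to summing billable events per crawler. A minimal Python sketch follows, with hypothetical crawler names and amounts; it says nothing about how Cloudflare actually stores or settles these records.

```python
from collections import defaultdict
from decimal import Decimal


def total_by_crawler(events):
    """Sum charged amounts per authenticated crawler identity."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for crawler, amount in events:
        totals[crawler] += amount
    return dict(totals)


# Hypothetical billable events: (crawler identity, amount charged in USD).
events = [
    ("example-bot-a", Decimal("0.01")),
    ("example-bot-b", Decimal("0.01")),
    ("example-bot-a", Decimal("0.01")),
]

debits = total_by_crawler(events)       # what each operator owes
remittance = sum(debits.values())       # single consolidated payout
```

Using `Decimal` rather than floating point keeps fractional-cent fees exact when millions of events are summed.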

The beta enforces one flat price for the entire site. Cloudflare’s roadmap includes path-level pricing, dynamic fees based on demand or crawler category, and license distinctions for training, inference, or search, but none of these features are live. Exemptions can be added at any time, so a publisher can grant free access to a research crawler while charging commercial models. The feature can be disabled by removing the rule; doing so reverts the site to its previous open or blocked posture without code changes.

Pay per Crawl therefore creates a predictable commercial framework for automated content access. It adds no local infrastructure, relies on standard HTTP, uses cryptographic signatures for identity, and integrates billing into Cloudflare’s existing edge platform – giving executives a clear path to monetize crawler traffic without negotiating individual contracts or staffing additional operations.

About one fifth of public websites already sit behind Cloudflare, and the company now offers to authenticate web crawlers, negotiate a price through HTTP headers, collect fees, and remit them to site owners. Large publishers such as Condé Nast, TIME, the Associated Press, and others have agreed to block unregistered AI crawlers by default and to rely on Cloudflare for paid access. Crawlers must identify themselves with RFC 9421 cryptographic message signatures; user-agent strings alone are no longer enough.

The program exempts Google’s traditional search crawler, reflecting publishers’ continued dependence on Google for traffic. As a result, Google can still train models on its cached pages without direct payment, giving it a competitive cost advantage and reinforcing its market power.

Supporters argue that charging per crawl will fund infrastructure costs, reduce bot traffic, and prompt the largest AI and search companies to cooperate on shared crawling services instead of each fetching the same pages repeatedly. Critics respond that fees may encourage mass production of AI-generated “slop” designed only to earn crawl revenue, raise barriers for smaller AI startups, and strengthen Cloudflare’s position as a private gatekeeper.

Publishers differ in their incentives. Governments, large corporations, and tourism boards often benefit when AI models quote their content, so they may prefer unrestricted crawling. Lawmakers are starting to look at how copyright and antitrust rules should cover AI training. U.S. courts have often said that using content to train AI counts as “fair use,” which weakens publishers’ bargaining power, but the rules are still unclear. Technical fixes help for now, but they won’t solve everything, and full answers will have to wait for new laws or regulations.

Cloudflare Gives Website Owners Option to Charge OpenAI Bots for Scraping | HackerNoon
