Will the generative AI giants be forced to rethink their business model in the near future? This is now a real possibility following Cloudflare's latest initiative: the company has just launched a service allowing hosts to charge the tools that collect data on their sites.
To fuel their large language models and their image generation systems, generative AI giants have deployed legions of programs called crawlers. Their mission: comb through public websites to harvest as much data as possible, which will then be used to train AI models such as GPT.
This practice, called scraping, has gradually become a strategic pillar of the business model of OpenAI and others. If they were one day deprived of this data source, these companies would find it much harder to keep training their AI models. The models could then start to stagnate, with direct consequences for the companies' competitiveness and, in the long term, their profitability.
A permanent balance of power
The problem is that scraping tends to be very poorly perceived, especially by creators and by many web players. Authors and artists, for example, regularly complain that their works are used to improve products sold at premium prices, without any financial compensation. Hosts, for their part, must cope with the invasion of these bots, which sometimes generate significant traffic that is difficult to manage.
In this context of friction, a kind of resistance is beginning to organize. More and more websites are trying, for example, to cut off access to crawlers... and some companies seem to see a real opportunity.
This is notably the case for Cloudflare, an extremely influential cloud infrastructure provider whose services are now used by around 20% of the sites referenced on the public Internet.
Pay per Crawl, a new business model
Over the past few months, the company has rolled out several tools allowing hosts to monitor and block crawlers that venture onto their turf. This initiative has deprived AI players of certain high-quality data sources, even if the impact of this line of defense on the industry remains difficult to measure at present.
But this month, Cloudflare decided to shift into a higher gear. The company launched a beta version of a platform called Pay per Crawl, which allows hosts to bill a certain sum to bots that try to collect information on their sites.
From confrontation to cooperation?
On paper, the idea has a lot going for it. Data owners could finally receive fair remuneration when their content is used to train models. To defend their interests in the face of what they sometimes describe as “looting”, they would no longer need to launch costly and not necessarily effective legal actions.
For AI giants, the situation is more nuanced. The prospect of suddenly having to pay for content they get today for free is undoubtedly worrying, for the reasons mentioned above. It would force them to make clear-cut choices. Should they pay potentially considerable sums to collect quality data, or settle for content recovered from freely accessible sites, with no guarantee of data quality and all that this implies in terms of reputation and therefore profits?
On the other hand, this remuneration model could also be beneficial for these companies. By agreeing to pay for the content they use, they would have the opportunity to show good faith. It would be a way to engage in transparent and respectful cooperation instead of getting further bogged down in the power struggle between the industry and the actors on which it remains fundamentally dependent. A relationship of trust, in short.
A new era for commercial AI
As is often the case in the business world, the whole challenge will be to find a happy medium on price so that everyone benefits. And Cloudflare is well aware of this.
“At the beginning, price discovery will play a key role: as creators obtain data on who pays what, a transparent market will emerge, reflecting the true value of original content,” explains the company, as quoted by Ars Technica.
It remains to be seen whether the AI giants will really agree to play along. In practice, the transition to this model would also represent a real change of philosophy: treating web content no longer as a free resource to be exploited, but as property whose value is negotiated. And in the long term, this dilemma could well lead to a deep restructuring of the Internet as we know it today.