Using 20 times less compute than its competitors, a model born at INRIA Paris beats operational weather forecasting systems. And its code is openly available.
Within three years, AI-based weather forecasting has become a battleground for giants. Google DeepMind launched GraphCast in 2023, then GenCast in 2025. The European Centre for Medium-Range Weather Forecasts (ECMWF, headquartered in Reading, UK) has been running its IFS ENS model for decades. To compete with these behemoths, you typically need colossal GPU farms and seven-figure research budgets. A team from INRIA in Paris has just demonstrated the opposite. Their work, published on April 22 in Science Advances, presents ArchesWeatherGen, a probabilistic model that outperforms IFS ENS and NeuralGCM (the hybrid physics-AI approach) on almost all benchmark weather variables. The exception is geopotential, where NeuralGCM keeps the advantage.
How to do better than Google with a fraction of its resources
The trick lies in a simple idea, well executed. Instead of training a generative model directly on raw atmospheric data (which costs a fortune in computing), the team proceeded in two steps. The first involves training a classic deterministic model, ArchesWeather, which predicts the average state of the atmosphere 24 hours later. The second uses a technique called “flow matching” (a modern variant of diffusion models, the same ones that are used to generate images) to model what the deterministic model does not capture: the amount of uncertainty, the alternative scenarios, the trajectories that the atmosphere could have followed.
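To make the two-step idea concrete, here is a minimal sketch in plain Python. It is not the team's code. It assumes the common linear-path form of flow matching, in which a sample is moved from pure noise (t = 0) to a residual (t = 1) by following a learned velocity field, and it assumes the final forecast is the deterministic prediction plus a sampled residual. The function names, the Euler integrator, and the step count are all illustrative choices; in a real system `velocity_fn` would be a trained neural network conditioned on the current atmospheric state.

```python
import random

def flow_matching_pair(noise, residual, t):
    """Training pair for linear-path flow matching (assumed formulation).

    x_t lies on the straight line from noise (t=0) to residual (t=1);
    the regression target for the network is the constant velocity
    residual - noise along that line.
    """
    x_t = [(1.0 - t) * n + t * r for n, r in zip(noise, residual)]
    velocity = [r - n for n, r in zip(noise, residual)]
    return x_t, velocity

def sample_residual(velocity_fn, noise, steps=20):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with explicit Euler steps."""
    x = list(noise)
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

def ensemble_forecast(deterministic_mean, velocity_fn, members=4, seed=0):
    """Each member = deterministic prediction + one sampled residual."""
    rng = random.Random(seed)
    forecasts = []
    for _ in range(members):
        noise = [rng.gauss(0.0, 1.0) for _ in deterministic_mean]
        residual = sample_residual(velocity_fn, noise)
        forecasts.append([m + r for m, r in zip(deterministic_mean, residual)])
    return forecasts
```

Drawing a different noise vector for each member is what produces the alternative scenarios: the deterministic model supplies the shared average, and the generative step supplies the spread around it.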
On the WeatherBench benchmark, which measures forecast accuracy against a full year of real data (2020 in this case), ArchesWeatherGen beats IFS ENS by 5.3% on average on the CRPS score, the reference metric for probabilistic forecasts, across all key variables and for lead times up to 10 days. Against Google's GenCast trained on a 111 km grid, the French model does slightly worse in the short term (1 to 3 days) but pulls ahead from 4 days onwards. And against the high-resolution version of GenCast, which divides the atmosphere into 28 km grid cells and costs twenty times more to train, ArchesWeatherGen converges to comparable performance at 9 and 10 days of lead time.
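For readers unfamiliar with CRPS (continuous ranked probability score), the sketch below shows the standard ensemble estimator for a single variable at a single point: the average distance between the ensemble members and the observation, minus half the average distance between the members themselves. Lower is better. The function names are illustrative, and the optional "fair" variant (a finite-ensemble correction) is included as an assumption; the article does not say which estimator the benchmark applies.

```python
def crps_ensemble(members, observation, fair=False):
    """Ensemble CRPS estimate for one scalar observation (lower is better).

    CRPS ~ mean|x_i - y| - 0.5 * mean|x_i - x_j|
    With fair=True the spread term is divided by M*(M-1) instead of M*M,
    which corrects the bias of small ensembles (needs at least 2 members).
    """
    m = len(members)
    if fair and m < 2:
        raise ValueError("fair CRPS needs at least two ensemble members")
    error_term = sum(abs(x - observation) for x in members) / m
    pair_sum = sum(abs(a - b) for a in members for b in members)
    denom = 2 * m * (m - 1) if fair else 2 * m * m
    return error_term - pair_sum / denom

def relative_improvement(crps_model, crps_reference):
    """Fractional CRPS gain of a model over a reference (0.05 means 5% better)."""
    return (crps_reference - crps_model) / crps_reference
```

With a single member the score reduces to the absolute error, which is why CRPS allows deterministic and probabilistic forecasts to be compared on the same scale.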
An accessible model that opens the door to university labs
ArchesWeatherGen's total training budget amounts to 45 days of compute on V100 GPUs, or about 23 days on newer A100 cards. To give an order of magnitude, Google's GenCast requires more than 1,000 V100 days. The training dataset divides the globe into 167 km grid cells and weighs only one terabyte, compared to 36 times more for the fine 28 km grid used by the competition. All code, pre-trained models, and the data pipeline are released on GitHub under an open license.
The team behind this work is led by Guillaume Couairon (since moved to Google DeepMind Paris) and Claire Monteleoni, as part of the “Choose France” chair in artificial intelligence. The computations ran on the French infrastructure of GENCI and IDRIS, the CNRS computing center. To put it plainly: this is French public research, financed by public funds, producing a model competitive with those of the best-funded laboratories on the planet.
ArchesWeatherGen will not replace Météo-France's operational models anytime soon, since those run at much finer resolutions. But it proves that a university lab can now produce world-class probabilistic forecasts on an affordable budget.
Source: Science Advances/INRIA
