Surya is a 366M-parameter model created by IBM and NASA to forecast solar activity, including flare events, solar wind, and precursors to solar eruptions. Such events can significantly affect astronaut safety in space as well as terrestrial systems such as communications and power distribution.
IBM and NASA trained Surya on nine years of full-resolution (4096×4096 pixel) images from NASA’s Solar Dynamics Observatory (SDO) satellite, captured at a 12-minute cadence. This dataset enables Surya to learn general-purpose solar representations that capture both fine- and large-scale events and their temporal variability.
IBM and NASA researchers emphasize that the new model marks a departure from previous work with narrowly focused, task-specific models, representing instead a more versatile approach to heliophysics.
Current ML applications in heliophysics research often depend on task-specific data and models trained from scratch, which can be inefficient, prone to overfitting, and limited by the scarcity of labeled data, especially for rare events (Ahmadzadeh et al., 2019), which are often the most interesting ones.
Despite not being task-specific, Surya outperformed existing specialized models, including U-Net for solar region segmentation, AlexNet for solar flare forecasts, and both AlexNet and ResNet50 for solar wind speed forecasting.
From an architectural perspective, Surya employs a 2-D transformer augmented with two spectral gating blocks, eight long-short attention blocks, and a decoder block for reconstruction in the physical domain.
The two spectral gating blocks combine frequency-domain filtering with learnable weights and adaptive re-weighting of spectral components to suppress noise while enhancing relevant features in the data. The long-short attention blocks enable the model to capture fine-grained local dependencies, long-range correlations, and multi-scale representations for a more comprehensive understanding of the data. The decoder block maps features back to the physical domain while preserving spatial structure and channel relationships. Full details of the transformations at each stage are available in the referenced paper.
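To make the idea of spectral gating more concrete, the following is a minimal NumPy sketch of frequency-domain filtering with a per-frequency weight tensor. It is not Surya's implementation: the function name `spectral_gate`, the tensor shapes, and the use of a plain 2-D real FFT are assumptions chosen for illustration, and in a trained model the weights would be learned rather than set by hand.

```python
import numpy as np

def spectral_gate(x, weights):
    """Illustrative spectral gating (hypothetical helper, not Surya's code).

    x:       real array of shape (H, W, C), e.g. a small feature map
    weights: complex array of shape (H, W // 2 + 1, C); stands in for
             the learnable per-frequency filter
    """
    # Transform the two spatial axes to the frequency domain.
    freq = np.fft.rfft2(x, axes=(0, 1))
    # Re-weight every spectral component element-wise.
    gated = freq * weights
    # Transform back, restoring the original spatial size.
    return np.fft.irfft2(gated, s=x.shape[:2], axes=(0, 1))

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 3))

# All-ones weights leave the input unchanged (up to rounding).
identity = np.ones((8, 5, 3), dtype=complex)
y_identity = spectral_gate(x, identity)

# Keeping only the zero-frequency component removes all spatial
# variation, leaving the per-channel mean at every pixel.
dc_only = np.zeros((8, 5, 3), dtype=complex)
dc_only[0, 0, :] = 1.0
y_dc = spectral_gate(x, dc_only)
```

The two hand-picked filters show the extremes: a gate that passes everything and one that suppresses all but the lowest frequency. A learned gate would sit somewhere in between, attenuating noisy components and amplifying informative ones.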
The researchers maintain that Surya is capable of developing representations that appear to be physics-aware to some extent, rather than merely memorizing past patterns. This is suggested by its ability to forecast solar dynamics without additional training.
Surya is available on Hugging Face and GitHub.