One of the new Linux engineering initiatives out of AMD is working to further enhance Linux performance on today’s large core count systems by introducing push-based load balancing.
AMD’s Linux kernel scheduler work at the end of 2025 is focused on introducing push-based load balancing to address several current scheduler issues: busy load balancing is always tasked to the first CPU of the group balance mask (so all of that work ends up on the same single CPU), there are scalability bottlenecks on large core count systems, and periodic balancing can sometimes take a long time to even out imbalances on systems with a large, flat scheduler domain hierarchy.
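For readers unfamiliar with the terminology, the difference between pulling and pushing work can be illustrated with a toy model. The Python sketch below is purely illustrative and is not taken from AMD's patches or from the kernel: the function names `pull_balance` and `push_balance`, the list-of-lists run queue representation, and the "more than one task means overloaded" rule are all assumptions made for the example. It only shows the general idea that a pull model relies on a designated CPU to fetch work, while a push model lets each overloaded CPU hand its surplus to idle CPUs.

```python
def pull_balance(run_queues, balancer=0):
    """Toy pull model: one designated CPU pulls a single task from the busiest queue.

    `balancer` is a hypothetical stand-in for the one CPU that does the balancing.
    """
    queues = [list(q) for q in run_queues]  # work on a copy
    busiest = max(range(len(queues)), key=lambda cpu: len(queues[cpu]))
    # Only pull when it actually reduces the imbalance.
    if busiest != balancer and len(queues[busiest]) > len(queues[balancer]) + 1:
        queues[balancer].append(queues[busiest].pop())
    return queues


def push_balance(run_queues):
    """Toy push model: every overloaded CPU pushes surplus tasks to idle CPUs."""
    queues = [list(q) for q in run_queues]  # work on a copy
    idle = [cpu for cpu, q in enumerate(queues) if not q]
    for q in queues:
        # Assumption for this sketch: a CPU with more than one task is overloaded.
        while len(q) > 1 and idle:
            queues[idle.pop()].append(q.pop())
    return queues


if __name__ == "__main__":
    start = [[], ["a", "b", "c"], [], ["d", "e"]]
    print("pull:", pull_balance(start))
    print("push:", push_balance(start))
```

In this toy setup a single pull pass moves one task to the balancing CPU and leaves the other idle CPU empty, while a single push pass fills both idle CPUs. Real scheduler behavior depends on many factors this sketch ignores, such as scheduler domains, task affinity, and cache locality.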
The proposed patches cover busy balancing optimizations and centralized “nohz” accounting optimizations. There is also experimental code for the push-based load balancing implementation itself as well as for optimizing intra-NUMA newidle balancing.
That code is marked “experimental” because it has some known performance regressions compared to the current state of the Linux kernel scheduler, which are being discussed and worked through. Benchmarks in the patch cover letter highlight some performance benefits already coming from the improved scheduler code as well as the areas left to address and optimize.
This push-based load balancing code is going to be discussed further in person later this week at the Linux Plumbers Conference. LPC 2025 is running from 11 to 13 December in Tokyo, Japan. More details on this experimental load-balancing work from AMD for enhancing Linux on large core count servers can be found via this Linux kernel mailing list thread.
