For several months now there has been an effort to wire up cache-aware scheduling / load balancing in the Linux kernel, improving task placement on processors with multiple cache domains such as modern AMD Ryzen/EPYC and Intel Xeon platforms. Cache-aware scheduling has shown much potential for further enhancing Linux performance on today’s CPUs. Out today is the third iteration of the cache-aware scheduling patches, featuring an important rework.
Intel engineer Tim Chen today posted the “v3” patches of cache-aware scheduling, which aggregate tasks that share data into the same cache domain. The focus remains on optimizing performance by reducing cache misses, lowering cache bouncing, and improving overall data access efficiency.
Tim Chen explained the fundamental improvement made in today’s v3 code:
“In previous versions, aggregation of tasks were done in the wake up path, without making load balancing paths aware of LLC (Last-Level-Cache) preference. This led to the following problems:
1) Aggregation of tasks during wake up led to load imbalance between LLCs
2) Load balancing tried to even out the load between LLCs
3) Wake up tasks aggregation happened at a faster rate and load balancing moved tasks in opposite directions, leading to continuous and excessive task migrations and regressions in benchmarks like schbench.
In this version, load balancing is made cache-aware. The main idea of cache-aware load balancing consists of two parts:
1) Identify tasks that prefer to run on their hottest LLC and move them there.
2) Prevent generic load balancing from moving a task out of its hottest LLC.”
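To make the two-part idea more concrete, here is a minimal toy sketch in Python. It is not the kernel code and does not reflect how the patches are implemented: the `Task` class, the per-LLC `hotness` scores, and the helper functions are all hypothetical names invented for illustration, load is simply counted as tasks per LLC, and any limits the real patches place on aggregation are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    llc: int  # LLC the task currently runs on
    # Hypothetical: LLC id -> score of how "hot" that LLC is for this task.
    hotness: dict = field(default_factory=dict)

    def preferred_llc(self):
        """Return the hottest LLC, or None if the task has no preference."""
        if not self.hotness:
            return None
        return max(self.hotness, key=self.hotness.get)


def aggregate_to_preferred(tasks):
    """Part 1: move tasks that have a preferred LLC onto it."""
    moved = []
    for t in tasks:
        pref = t.preferred_llc()
        if pref is not None and t.llc != pref:
            t.llc = pref
            moved.append(t.name)
    return moved


def can_migrate(task, dst_llc):
    """Part 2: refuse to pull a task out of its hottest LLC."""
    pref = task.preferred_llc()
    return pref is None or task.llc != pref or dst_llc == pref


def generic_balance(tasks, llc_ids):
    """Even out task counts between LLCs, honoring can_migrate()."""
    moved = []
    while True:
        load = {llc: 0 for llc in llc_ids}
        for t in tasks:
            load[t.llc] += 1
        busiest = max(llc_ids, key=load.get)
        idlest = min(llc_ids, key=load.get)
        if load[busiest] - load[idlest] <= 1:
            break  # balanced enough
        candidates = [t for t in tasks
                      if t.llc == busiest and can_migrate(t, idlest)]
        if not candidates:
            break  # only cache-hot tasks remain; leave them in place
        candidates[0].llc = idlest
        moved.append(candidates[0].name)
    return moved
```

In this toy model, tasks with a preference are gathered on their hottest LLC first, and the generic balancer then evens out the load using only tasks that have no preference, so the two mechanisms no longer push the same tasks in opposite directions.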
Results shown in the v3 cover letter indicate some nice gains in benchmarks such as the Hackbench scheduler test, Schbench, and other workloads, mostly synthetic tests used to evaluate the initial impact.
The Cache-Aware Scheduling v3 patches are based atop the Linux 6.15 kernel source tree. Those interested in this cache-aware scheduling work for the Linux kernel can see the patch series, which remains under a Request For Comments flag. Hopefully it will be in shape for mainlining in the Linux kernel in the months ahead.