Over the past year Intel engineers have invested significant work in Cache Aware Scheduling (CAS) for the Linux kernel. This yet-to-be-merged functionality allows the Linux kernel to better aggregate tasks that share data onto the same last level cache (LLC) domain, reducing cache misses and cache-line bouncing. While the Cache Aware Scheduling development has been led by Intel, it also benefits other CPU vendors’ processors with multiple cache domains. Back in October I showed some nice performance wins for AMD EPYC Turin with Cache Aware Scheduling, while today’s article presents benchmarks of the newest CAS code and a look at the performance benefit on Xeon 6 “Granite Rapids” processors.
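For a rough sense of what the scheduler patches automate, here is a minimal user-space sketch of the same idea: discovering which CPUs share a last-level cache via the standard Linux sysfs cache topology and pinning a process to that LLC domain so cooperating tasks stay within one cache. This is only an illustration of the concept, not the kernel patch set's mechanism; the sysfs paths are standard on Linux but the LLC's `index*` number varies by CPU.

```python
# Sketch: approximating by hand what Cache Aware Scheduling does automatically --
# keeping tasks that share data within one last-level cache (LLC) domain.
# Assumes a Linux sysfs cache topology; illustrative only.
import glob
import os

def parse_cpu_list(s):
    """Expand a sysfs CPU list like '0-3,8,10-11' into a set of CPU numbers."""
    cpus = set()
    for part in s.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        elif part:
            cpus.add(int(part))
    return cpus

def llc_siblings(cpu=0):
    """Return the set of CPUs sharing the highest-level (last-level) cache with `cpu`."""
    best_level, siblings = -1, {cpu}
    for idx in glob.glob(f"/sys/devices/system/cpu/cpu{cpu}/cache/index*"):
        with open(os.path.join(idx, "level")) as f:
            level = int(f.read())
        if level > best_level:
            with open(os.path.join(idx, "shared_cpu_list")) as f:
                best_level, siblings = level, parse_cpu_list(f.read())
    return siblings

if __name__ == "__main__" and os.path.isdir("/sys/devices/system/cpu/cpu0/cache"):
    # Pin this process (and any children it forks) to one LLC domain.
    os.sched_setaffinity(0, llc_siblings(0))
    print(sorted(os.sched_getaffinity(0)))
```

The point of the exercise is what CAS removes: rather than every multi-threaded application hard-coding affinity like this, the scheduler itself learns which tasks share data and steers them toward a common LLC.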
The October round of Cache Aware Scheduling tests was carried out on a dual AMD EPYC 9965 server and showed nice results from this proposed kernel code across a variety of workloads. Thanks to Giga Computing having sent over the Gigabyte R284-A92-AAL1 barebones server a while back to replace my failed Granite Rapids AP reference server platform, I am now able to run some Cache Aware Scheduling benchmarks with the Xeon 6900P series processors.
Last week Intel engineers posted the latest iteration of the Cache Aware Scheduling code (as “v2” in their post-RFC form). Using the Gigabyte R284-A92-AAL1 server with two Intel Xeon 6980P processors and 24 x 64GB DDR5-8800 MRDIMM memory, I ran some benchmarks comparing the performance impact of the Cache Aware Scheduling code patched into the kernel and enabled.
I tested against the cache-aware-v2 Git branch of those patches published last week, with that branch currently tracking the Linux 6.18-rc7 kernel. The Cache Aware Scheduling performance was compared to mainline Linux 6.18-rc7 without Cache Aware Scheduling present. The Intel Xeon 6P server was otherwise running Ubuntu 25.10 with its default packages such as GCC 15.2, aside from swapping in the Linux 6.18-based kernels.
