A patch series posted overnight, part of a larger planned rework of the kernel’s swap code that introduces a “Swap Table”, is poised to bring significant real-world performance gains to the Linux kernel.
Kairui Song of Tencent posted the initial phase of patches that introduce a Swap Table and use it as the kernel’s swap cache. This stems from an idea Kairui raised during an LSF/MM/BPF talk for integrating the swap cache, swap maps, and swap allocator functionality within the Linux kernel. The redesign of the swap code is meant to be more future-proof and to deliver lower memory usage and higher performance than the current code:
“People have been complaining about the SWAP management subsystem. Many incremental workarounds and optimizations are added, but causes many other problems and making implementing new features more difficult. One reason is the current design almost has the minimal memory usage (1 byte swap map) with acceptable performance, so it’s hard to beat with incremental changes. But actually as more code and features are added, there are already lots of duplicated parts. So I’m proposing this idea to overhaul whole SWAP slot management from a different aspect, as the following work on the SWAP allocator.”
Posted overnight was the phase one patch series, which introduces the proposed Swap Table infrastructure and uses it as the swap cache back-end. It’s already showing very nice results:
“This phase I contains 9 patches, introduces the swap table infrastructure and uses it as the swap cache backend. By doing so, we have up to ~5-20% performance gain in throughput, RPS or build time for benchmark and workload tests.
…
Testing has shown that phase I has a significant performance improvement from 8c/1G ARM machine to 48c96t/128G x86_64 servers in many practical workloads.”
Those are some damn nice performance results: a VM scalability benchmark shows more than a 20% improvement in most scenarios, Linux kernel build times are up to a few percent faster, and the Redis/Valkey in-memory database enjoys around 6~7% higher throughput.
It’s a very nice start for these swap table patches. Hopefully this work is ultimately deemed acceptable for upstreaming into the mainline Linux kernel and the remaining phases of the swap table patch series follow soon.