Among the changes that landed this week for the Linux 6.15 merge window were all of the memory management (MM) updates, which include several notable patch series.
Andrew Morton sent in the MM changes for Linux 6.15, and there are a few exciting ones to find in this next version of the Linux kernel.
First is a new command-line option to control how many threads are used to allocate huge pages. This can help “significantly” reduce boot time by tuning the parallelization of huge page initialization. The new option is “hugetlb_alloc_threads” and is ideal if you want to speed up boot when allocating a large number of huge pages. The hugetlb_alloc_threads default is 25% of the available hardware threads. Cyberus Tech engineers found a speed-up of as much as 2.75x to 4.3x on large servers.
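To make the default concrete, here is a minimal Python sketch of how the effective thread count could be worked out from a kernel command line. The function name and the example command line are hypothetical, and the rounding (round down, never below one thread) is an assumption; the article only states that the default is 25% of the available hardware threads. The kernel's own parsing may differ in detail.

```python
def hugetlb_alloc_threads(cmdline: str, hw_threads: int) -> int:
    """Estimate the huge page allocation thread count for a given boot.

    Assumptions (not confirmed by the article): an explicit positive
    integer on the command line wins; otherwise the default is 25% of
    the hardware threads, rounded down, with a floor of one thread.
    """
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "hugetlb_alloc_threads" and sep and value.isdigit():
            if int(value) > 0:
                return int(value)
    # No usable override found: fall back to the 25% default.
    return max(1, hw_threads // 4)


# Hypothetical command line for a 128-thread server reserving 1G huge pages.
example = "hugepagesz=1G hugepages=512 hugetlb_alloc_threads=64"
print(hugetlb_alloc_threads(example, 128))   # explicit override
print(hugetlb_alloc_threads("quiet", 128))   # default of 25%
```

On a running system the actual boot parameters can be read from /proc/cmdline and passed to a helper like this one.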
Another interesting patch series in this round of memory management changes makes the huge page allocator more reliable. Huge page allocations should succeed more often, with less fragmentation, while also being cheaper thanks to these patches. Johannes Weiner explained in an earlier patch series on this work, which goes back two years:
“As memory capacity continues to grow, 4k TLB coverage has not been able to keep up. On Meta’s 64G webservers, close to 20% of execution cycles are observed to be handling TLB misses when using 4k pages only. Huge pages are shifting from being a nice-to-have optimization for HPC workloads to becoming a necessity for common applications.
However, while trying to deploy THP more universally, we observe a fragmentation problem in the page allocator that often prevents larger requests from being met quickly, or met at all, at runtime. Since we have to provision hardware capacity for worst case performance, unreliable huge page coverage isn’t of much help.
…
In a broad sample of Meta servers, we find that unmovable allocations make up less than 7% of total memory on average, yet occupy 34% of the 2M blocks in the system. We also found that this effect isn’t correlated with high uptimes, and that servers can get heavily fragmented within the first hour of running a workload.”
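To get a feel for those percentages, the Python sketch below applies them to a hypothetical 64 GiB machine. Combining the fleet-wide averages with the 64G webserver size is an assumption made purely for illustration, and since the quote says "less than 7%", the 7% figure is an upper bound. The function name is likewise hypothetical.

```python
import math

GIB = 1024 ** 3
MIB = 1024 ** 2


def block_usage(total_bytes, unmovable_fraction, occupied_block_fraction,
                block_bytes=2 * MIB):
    """Compare observed vs. ideally packed 2M block usage.

    Returns (total blocks, blocks containing unmovable allocations,
    blocks needed if the unmovable memory were packed perfectly).
    """
    total_blocks = total_bytes // block_bytes
    occupied = round(total_blocks * occupied_block_fraction)
    unmovable_bytes = total_bytes * unmovable_fraction
    ideal = math.ceil(unmovable_bytes / block_bytes)
    return total_blocks, occupied, ideal


# Hypothetical 64 GiB server with the quoted 7% / 34% figures.
total, occupied, ideal = block_usage(64 * GIB, 0.07, 0.34)
print(total, occupied, ideal)  # 32768 11141 2294
print(f"spread factor: {occupied / ideal:.1f}x")
```

Under these assumptions, unmovable allocations that would fit in roughly 2,300 blocks are scattered across more than 11,000 of them, which is why so few 2M blocks remain available for huge pages.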
Separately with this MM pull request, the Z3fold and Zbud allocators have been removed. Z3fold and Zbud were already deprecated and are now being removed entirely from the upstream kernel.
Some of the other “MM” changes in Linux 6.15 include:
– DAMON fixes and various improvements. Most notable among the DAMON work this cycle is automatic tuning of DAMON’s aggregation interval using a feedback loop.
– Batched unmapping of lazy-free large folios during reclaim, to speed up the unmapping of PTE-mapped large folios.
– A patch series re-implementing per-VMA locks as a refcount, which can yield a 0 to 10% improvement in at least one micro-benchmark.
– HugeTLB and CMA (Contiguous Memory Allocator) improvements for large systems/servers.
– ZRAM has been extended to run compression and decompression operations in a preemptible manner.
– Lazy MMU mode fixes for x86, SPARC, and POWER.
– Improvements to Heterogeneous Memory Management (HMM), including various fixes for device-exclusive entries.
– Proactive memory reclaim statistics are now reported by the kernel.
More details via the MM pull request, which has already been merged to Linux 6.15 Git.