An important set of patches was merged to Linux Git a few minutes ago for the in-development Linux 6.19 kernel, with significant performance implications.
Intel Fellow Thomas Gleixner yesterday sent in the “core/rseq” pull request for Linux 6.19 Git, which Linus Torvalds merged today. This pull includes the patches rewriting the memory-map concurrency ID “MM CID” code within the kernel, which was found to improve PostgreSQL database throughput by up to ~14%. My own testing of this MM CID rewrite, carried out on AMD EPYC hardware, also showed very positive gains. My tests of those patches were covered in Intel’s Rewrite Of Linux MM CID Code Showing Some Nice Gains For AMD.
The pull request includes the CID management improvements and also addresses some issues that came up after the GNU C Library (glibc) recently began using Restartable Sequences (RSEQ).
Gleixner explained all the work within the core/rseq pull request:
“A large overhaul of the restartable sequences and CID management:
The recent enablement of RSEQ in glibc resulted in regressions which are caused by the related overhead. It turned out that the decision to invoke the exit to user work was not really a decision. More or less each context switch caused that. There is a long list of small issues which sums up nicely and results in a 3-4% regression in I/O benchmarks.
The other detail which caused issues due to extra work in context switch and task migration is the CID (memory context ID) management. It also requires to use a task work to consolidate the CID space, which is executed in the context of an arbitrary task and results in sporadic uncontrolled exit latencies.
The rewrite addresses this by:
– Removing deprecated and long unsupported functionality
– Moving the related data into dedicated data structures which are optimized for fast path processing.
– Caching values so actual decisions can be made
– Replacing the current implementation with a optimized inlined variant.
– Separating fast and slow path for architectures which use the generic entry code, so that only fault and error handling goes into the TIF_NOTIFY_RESUME handler.
– Rewriting the CID management so that it becomes mostly invisible in the context switch path. That moves the work of switching modes into the fork/exit path, which is a reasonable tradeoff. That work is only required when a process creates more threads than the cpuset it is allowed to run on or when enough threads exit after that. An artificial thread pool benchmarks which triggers this did not degrade, it actually improved significantly.
The main effect in migration heavy scenarios is that runqueue lock held time and therefore contention goes down significantly.”
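To make the fork/exit tradeoff described above a little more concrete, here is a toy Python model. It is only a sketch of the idea, not kernel code: the class name, the `oversubscribed` flag, and the simple "threads > allowed CPUs" threshold are all assumptions made for illustration, and the real implementation may use different conditions. The point it tries to show is that the mode decision is re-evaluated only when a thread is created or exits, while a context switch does no such bookkeeping.

```python
from dataclasses import dataclass


@dataclass
class ProcessCidModel:
    """Hypothetical toy model of CID mode switching (not the kernel's code)."""

    allowed_cpus: int             # size of the cpuset the process may run on
    threads: int = 1              # a process starts with one thread
    oversubscribed: bool = False  # assumed mode flag: more threads than CPUs
    mode_switches: int = 0        # how often the (slow-path) mode changed

    def _update_mode(self) -> None:
        # Assumed threshold: switch mode when the thread count crosses
        # the number of allowed CPUs in either direction.
        want = self.threads > self.allowed_cpus
        if want != self.oversubscribed:
            self.oversubscribed = want
            self.mode_switches += 1

    def fork_thread(self) -> None:
        # Mode is only re-evaluated on thread creation...
        self.threads += 1
        self._update_mode()

    def exit_thread(self) -> None:
        # ...and on thread exit.
        if self.threads <= 1:
            raise ValueError("cannot exit the last thread in this model")
        self.threads -= 1
        self._update_mode()

    def context_switch(self) -> None:
        # In this model the context switch path does no mode bookkeeping.
        pass


if __name__ == "__main__":
    proc = ProcessCidModel(allowed_cpus=4)
    for _ in range(3):
        proc.fork_thread()        # 4 threads on 4 CPUs: no mode change
    for _ in range(1000):
        proc.context_switch()     # never touches the mode
    proc.fork_thread()            # 5th thread crosses the threshold
    proc.exit_thread()            # back to 4 threads, mode switches back
    print(proc.threads, proc.oversubscribed, proc.mode_switches)
```

In this model a process that stays at or below its allowed CPU count never pays for a mode switch at all, which is the case the quoted message describes as the common one.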
Great work. This pull, plus the other performance optimization work expected to be merged for Linux 6.19, will make for some exciting kernel benchmarking in the days and weeks ahead.
