The FUTEX locking changes merged last week for Linux 6.17 include a fix for an observed performance bottleneck.
Prominent Linux engineer Peter Zijlstra of Intel adapted the FUTEX code to use RCU-based, per-CPU reference counting. This addresses a bottleneck in the existing code, which relied on a single rcuref_t counter accessed concurrently by multiple threads.
Peter explained in the patch addressing the FUTEX performance bottleneck:
“The use of rcuref_t for reference counting introduces a performance bottleneck when accessed concurrently by multiple threads during futex operations.
Replace rcuref_t with special crafted per-CPU reference counters. The lifetime logic remains the same.
The newly allocate private hash starts in FR_PERCPU state. In this state, each futex operation that requires the private hash uses a per-CPU counter (an unsigned int) for incrementing or decrementing the reference count.”
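To illustrate why per-CPU counters relieve this kind of contention, here is a minimal user-space sketch in C11. It is not the kernel's implementation: the type `percpu_ref_sketch`, the functions `ref_init`, `ref_get`, `ref_put`, and `ref_sum`, and the fixed `NSLOTS` count are all hypothetical names chosen for illustration, and the caller passes a slot index explicitly where the kernel would use the current CPU. The sketch only models the per-CPU counting mode described in the quote. It does not model the RCU-based lifetime handling or any state transitions.

```c
#include <assert.h>
#include <stdatomic.h>

#define NSLOTS 4 /* hypothetical stand-in for the number of CPUs */

/* Hypothetical type for illustration; not the kernel's data structure. */
struct percpu_ref_sketch {
    /* One unsigned counter per slot, each aligned to its own 64-byte
     * block so that updates on different slots do not share a cache line. */
    struct {
        _Alignas(64) atomic_uint count;
    } slot[NSLOTS];
};

static void ref_init(struct percpu_ref_sketch *r)
{
    for (int i = 0; i < NSLOTS; i++)
        atomic_init(&r->slot[i].count, 0u);
}

/* Take a reference: touch only the caller's own slot. */
static void ref_get(struct percpu_ref_sketch *r, unsigned int cpu)
{
    assert(cpu < NSLOTS);
    atomic_fetch_add_explicit(&r->slot[cpu].count, 1u, memory_order_relaxed);
}

/* Drop a reference: may run on a different slot than the matching get,
 * so an individual slot can wrap below zero. */
static void ref_put(struct percpu_ref_sketch *r, unsigned int cpu)
{
    assert(cpu < NSLOTS);
    atomic_fetch_sub_explicit(&r->slot[cpu].count, 1u, memory_order_relaxed);
}

/* Sum all slots. Unsigned wrap-around cancels out, so the total is the
 * number of outstanding references, provided no get or put is in flight. */
static unsigned int ref_sum(struct percpu_ref_sketch *r)
{
    unsigned int sum = 0;
    for (int i = 0; i < NSLOTS; i++)
        sum += atomic_load_explicit(&r->slot[i].count, memory_order_relaxed);
    return sum;
}
```

In this sketch, two `ref_get` calls on slots 0 and 1 followed by a `ref_put` on slot 3 leave slot 3 wrapped around to the maximum unsigned value, yet `ref_sum` still reports one outstanding reference because unsigned arithmetic is modular. The trade-off is that reading the total requires visiting every slot, which is acceptable when increments and decrements are far more frequent than checks of the total.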
The FUTEX improvements landed in Linux 6.17 Git as part of the locking/futex pull request.