Merged today ahead of the Linux 6.18-rc5 kernel due out on Sunday is a partial fix for a performance regression observed on IBM POWER hardware.
Since the “IMMUTABLE” flag was dropped from the kernel’s FUTEX code for the Linux 6.17 cycle, IBM engineers have noted a performance regression primarily affecting their hardware. Now for Linux 6.18-rc5 that performance regression is at least cut in half.
Intel engineer Peter Zijlstra worked out the partial fix/workaround by optimizing the per-CPU reference counting in the futex code. Zijlstra explained with the now-merged patch:
“Shrikanth noted that the per-cpu reference counter was still some 10% slower than the old immutable option (which removes the reference counting entirely).
Further optimize the per-cpu reference counter by:
– switching from RCU to preempt;
– using __this_cpu_*() since we now have preempt disabled;
– switching from smp_load_acquire() to READ_ONCE().

This is all safe because disabling preemption inhibits the RCU grace period exactly like rcu_read_lock().
Having preemption disabled allows using __this_cpu_*() provided the only access to the variable is in task context — which is the case here.
Furthermore, since we know changing fph->state to FR_ATOMIC demands a full RCU grace period we can rely on the implied smp_mb() from that to replace the acquire barrier().
This is very similar to the percpu_down_read_internal() fast-path.
The reason this is significant for PowerPC is that it uses the generic this_cpu_*() implementation which relies on local_irq_disable() (the x86 implementation relies on it being a single memop instruction to be IRQ-safe). Switching to preempt_disable() and __this_cpu*() avoids this IRQ state swizzling. Also, PowerPC needs LWSYNC for the ACQUIRE barrier, so not having to use explicit barriers saves a bunch.
Combined this reduces the performance gap by half, down to some 5%.”
This improvement was merged to the Linux 6.18 Git code today as the sole change of this week’s locking/urgent pull request.
