A Linux scheduler patch queued into a TIP branch this past week further restricts the preemption modes that will be offered. Now that it has landed in the “sched/core” branch, it will likely be submitted for the upcoming Linux 7.0 (or what may instead be called Linux 6.20).
Longtime Intel Linux engineer Peter Zijlstra authored the patch, which further restricts the Linux kernel's preemption modes, the settings that balance system throughput against latency. Peter explained in the patch:
“The introduction of PREEMPT_LAZY was for multiple reasons:
– PREEMPT_RT suffered from over-scheduling, hurting performance compared to !PREEMPT_RT.
– the introduction of (more) features that rely on preemption; like folio_zero_user() which can do large memset() without preemption checks. (Xen already had a horrible hack to deal with long running hypercalls)
– the endless and uncontrolled sprinkling of cond_resched() — mostly cargo cult or in response to poor to replicate workloads.
By moving to a model that is fundamentally preemptable these things become managable and avoid needing to introduce more horrible hacks.
Since this is a requirement; limit PREEMPT_NONE to architectures that do not support preemption at all. Further limit PREEMPT_VOLUNTARY to those architectures that do not yet have PREEMPT_LAZY support (with the eventual goal to make this the empty set and completely remove voluntary preemption and cond_resched() — notably VOLUNTARY is already limited to !ARCH_NO_PREEMPT.)
This leaves up-to-date architectures (arm64, loongarch, powerpc, riscv, s390, x86) with only two preemption models: full and lazy.
While Lazy has been the recommended setting for a while, not all distributions have managed to make the switch yet. Force things along. Keep the patch minimal in case of hard to address regressions that might pop up.”
In short, the patch blocks the none and voluntary preemption models on modern CPU architectures. x86/x86_64, s390, RISC-V, POWER, LoongArch, and ARM64 will be limited to the full and lazy preemption models.
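To make the selection rules from the commit message easier to follow, here is a rough Python sketch of them. It is not the kernel's Kconfig logic. The function name, the `supports_preemption` flag, and the architecture set are illustrative assumptions, and it assumes full preemption stays available wherever preemption is supported at all.

```python
# Illustrative model only; none of these names are real kernel symbols.
# Architectures the commit message lists as having PREEMPT_LAZY support.
LAZY_CAPABLE = {"arm64", "loongarch", "powerpc", "riscv", "s390", "x86"}


def available_preemption_models(arch, supports_preemption=True):
    """Return the preemption models selectable for an architecture under the new rules."""
    if not supports_preemption:
        # PREEMPT_NONE is limited to architectures that cannot preempt at all.
        return ["none"]
    if arch in LAZY_CAPABLE:
        # Up-to-date architectures are left with only full and lazy.
        return ["full", "lazy"]
    # Assumption: architectures without lazy support keep voluntary, plus full.
    return ["voluntary", "full"]


if __name__ == "__main__":
    print(available_preemption_models("x86"))
    # "example_arch" is a made-up name for an architecture without lazy support.
    print(available_preemption_models("example_arch"))
    print(available_preemption_models("example_arch", supports_preemption=False))
```

Per the commit message, the stated long-term goal is for the middle case to become empty, so that voluntary preemption can be removed entirely.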
Barring any issues or stakeholder objections, this patch in tip/tip.git’s sched/core branch will likely be merged for the upcoming Linux 6.20~7.0 cycle.
