At the end of April I reported on a significant performance regression affecting newer AMD CPUs that was bisected to a change in the AMD SRSO mitigation handling for Zen 4/5 processors in the Linux 6.15 kernel. The fix for that regression was merged today ahead of the imminent Linux 6.15-rc6 release.
First spotted earlier in April, this regression affecting newer AMD processors ended up being quite significant. Put simply, if the KVM kernel module for virtualization support was loaded but not actually being used to run any virtual machines (VMs), some costly mitigations were applied that weren’t needed. The performance costs involved ended up being more significant than the upstream kernel developers from AMD and Google realized at the time.
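For those wanting to see whether a system matches that condition, here is a minimal sketch that checks whether the `kvm_amd` module shows up in `/proc/modules` and prints the kernel's reported SRSO status. The helper names (`loaded_modules`, `kvm_amd_loaded`, `read_text_or_none`) are made up for illustration, and the check assumes KVM is built as a loadable module; a kernel with KVM built in will not list it there.

```python
from pathlib import Path


def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first whitespace-separated field is the module name.
            names.add(fields[0])
    return names


def kvm_amd_loaded(proc_modules_text):
    """True if the AMD KVM module appears in the given module listing."""
    return "kvm_amd" in loaded_modules(proc_modules_text)


def read_text_or_none(path):
    """Read a file, returning None if it is missing or unreadable."""
    try:
        return Path(path).read_text()
    except OSError:
        return None


if __name__ == "__main__":
    modules = read_text_or_none("/proc/modules")
    if modules is None:
        print("/proc/modules is not readable on this system")
    else:
        print("kvm_amd loaded:", kvm_amd_loaded(modules))

    # sysfs entry where the kernel reports its SRSO mitigation state.
    srso = read_text_or_none(
        "/sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow"
    )
    if srso is not None:
        print("SRSO status:", srso.strip())
```

This only tells you whether the module is loaded, not whether any VMs are active or whether a given kernel build carries the regression.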
A patch was posted in early May to address this regression, and now for Linux 6.15-rc6 a good-enough solution has been merged.
Merged today via the KVM fixes for Linux 6.15 was the patch “KVM: SVM: Set/clear SRSO’s BP_SPEC_REDUCE on 0 <=> 1 VM count transitions”. Its commit message explains:
Set the magic BP_SPEC_REDUCE bit to mitigate SRSO when running VMs if and only if KVM has at least one active VM. Leaving the bit set at all times unfortunately degrades performance by a wee bit more than expected.
Use a dedicated spinlock and counter instead of hooking virtualization enablement, as changing the behavior of kvm.enable_virt_at_load based on SRSO_BP_SPEC_REDUCE is painful, and has its own drawbacks, e.g. could result in performance issues for flows that are sensitive to VM creation latency.
Defer setting BP_SPEC_REDUCE until VMRUN is imminent to avoid impacting performance on CPUs that aren’t running VMs, e.g. if a setup is using housekeeping CPUs. Setting BP_SPEC_REDUCE in task context, i.e. without blasting IPIs to all CPUs, also helps avoid serializing 1<=>N transitions without incurring a gross amount of complexity (see the Link for details on how ugly coordinating via IPIs gets).
Link: https://lore.kernel.org/all/[email protected]
Fixes: 8442df2b49ed (“x86/bugs: KVM: Add support for SRSO_MSR_FIX”)
Reported-by: Michael Larabel
Closes: https://www..com/review/linux-615-amd-regression
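To illustrate the idea in the commit message above, here is a rough user-space model of toggling a flag only on 0 <=> 1 VM count transitions, guarded by a dedicated lock and counter. This is not the kernel code: the class and method names are hypothetical, `threading.Lock` merely stands in for the spinlock, and the sketch deliberately ignores the per-CPU deferral until VMRUN that the patch describes.

```python
import threading


class BpSpecReduceModel:
    """Toy model of the 0 <=> 1 VM count transitions (not kernel code)."""

    def __init__(self):
        self._lock = threading.Lock()  # stands in for the dedicated spinlock
        self._vm_count = 0             # stands in for the dedicated counter
        self.bit_set = False           # stands in for the BP_SPEC_REDUCE bit
        self.transitions = []          # record of bit changes, for illustration

    def vm_created(self):
        with self._lock:
            self._vm_count += 1
            if self._vm_count == 1:
                # 0 -> 1: the first VM appeared, so turn the mitigation on.
                self.bit_set = True
                self.transitions.append("set")

    def vm_destroyed(self):
        with self._lock:
            if self._vm_count == 0:
                raise RuntimeError("no active VMs to destroy")
            self._vm_count -= 1
            if self._vm_count == 0:
                # 1 -> 0: the last VM went away, so turn the mitigation off.
                self.bit_set = False
                self.transitions.append("clear")


if __name__ == "__main__":
    model = BpSpecReduceModel()
    model.vm_created()
    model.vm_created()     # second VM: no bit change
    model.vm_destroyed()   # one VM left: bit stays set
    model.vm_destroyed()   # last VM gone: bit cleared
    print(model.transitions)
```

The point of counting rather than keying off module load is that a system with KVM loaded but zero VMs never pays for the mitigation, while creating a second or third VM costs nothing extra beyond the counter update.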
Thus Linux 6.15 out-of-the-box, without running any KVM VMs, will no longer tank system performance. If you are running virtual machines on your system you will still see the performance hit there due to the security enforcement, but at least simply booting the Linux kernel and not caring about any VMs will no longer cause the big performance penalty observed earlier on Linux 6.15 Git.