A nice Christmas surprise for 2024 was Meta engineer Rik van Riel posting Linux kernel patches to make use of the AMD INVLPGB instruction, available since Zen 3 processors, for broadcast TLB invalidation.
The synthetic benchmark of the AMD INVLPGB showed a very significant boost:
Benchmarks I’ve run of the AMD INVLPGB patches have also shown a nice uptick in performance for this feature, which is now found across Zen 3 / Zen 4 / Zen 5 systems for both Ryzen client and EPYC server processors.
Over the holidays the patches continued to be worked on, and over the weekend a fourth iteration of the AMD INVLPGB patches was posted for review.
Rik van Riel commented on the v4 patches that they address “a large amount of feedback and debate” around the prior patches. The updated patches now use only bitmaps to track free global ASIDs, improve AMD initialization, make various improvements to the code and documentation, and fix possible subtle race conditions.
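For those unfamiliar with the idea of tracking free IDs with a bitmap, here is a minimal user-space sketch in C. It is not taken from the patches: the pool size `NUM_IDS` and the helpers `id_alloc()` and `id_free()` are hypothetical names chosen for illustration, and real kernel code would additionally need locking and would deal with actual ASID ranges.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pool size, chosen only for this illustration. */
#define NUM_IDS 64u
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define NUM_WORDS ((NUM_IDS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bit per ID: 1 means in use, 0 means free. */
static unsigned long id_used[NUM_WORDS];

/* Report whether the given ID is currently marked as in use. */
static bool id_test(unsigned int id)
{
    return (id_used[id / BITS_PER_WORD] >> (id % BITS_PER_WORD)) & 1UL;
}

/* Claim the lowest free ID, or return -1 when the pool is exhausted. */
static int id_alloc(void)
{
    for (unsigned int id = 0; id < NUM_IDS; id++) {
        if (!id_test(id)) {
            id_used[id / BITS_PER_WORD] |= 1UL << (id % BITS_PER_WORD);
            return (int)id;
        }
    }
    return -1;
}

/* Return a previously allocated ID to the pool. */
static void id_free(unsigned int id)
{
    assert(id < NUM_IDS && id_test(id));
    id_used[id / BITS_PER_WORD] &= ~(1UL << (id % BITS_PER_WORD));
}
```

In this sketch the bitmap is the only bookkeeping structure, so allocating, freeing, and checking an ID each touch a single bit. Freed IDs become available again on the next `id_alloc()` call.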
These patches continue looking quite nice and will hopefully be mainlined into the Linux kernel in 2025. I’ll be running more AMD INVLPGB benchmarks as this work moves closer to the upstream tree. Those wanting to try out the v4 patches can find them on the Linux kernel mailing list.