Following the Vulkanised 2025 presentation on how NVIDIA is finding great success with Vulkan for AI / machine learning and is already competitive with CUDA in some areas, Red Hat engineer and DRM subsystem lead maintainer David Airlie began exploring the potential of Mesa Vulkan drivers for AI inferencing. He successfully used the Intel ANV, NVIDIA NVK, and Radeon RADV drivers for Vulkan-based AI inferencing, and it's the Radeon hardware tested that is showing the most potential (performance) at the moment, even competing with the ROCm compute stack.
David Airlie shared a blog post today outlining his experiences exploring the Mesa Vulkan drivers for AI inferencing. At the same time, he and others like Karol Herbst of Red Hat are working on addressing feature gaps in the Mesa Vulkan drivers to make them more suitable for handling AI workloads.
Using the ramalama wrapper around Llama.cpp, Airlie has been testing the different open-source and closed-source driver options. With the Mesa NVK driver the performance is still a lot slower than with the official NVIDIA closed-source driver stack. No real surprise, especially given our recent NVK graphics benchmarks in Mesa 25.2 NVK vs. NVIDIA R575 Linux Graphics Performance For GeForce RTX 40 Series.
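Airlie's post is the source for the actual numbers, but for readers wanting to try a similar comparison themselves, a minimal sketch might look like the following. The exact commands, flags, and model names here are assumptions, not taken from Airlie's post; llama.cpp's upstream `llama-bench` tool and the `ramalama bench` subcommand are the relevant entry points.

```shell
# Hypothetical sketch of benchmarking llama.cpp's Vulkan backend -- the model
# path and choices below are placeholders, not Airlie's actual test setup.

if command -v llama-bench >/dev/null 2>&1; then
    # Offload all layers to the Vulkan device (-ngl 99) and measure both
    # prompt processing (pp) and token generation (tg) throughput.
    llama-bench -m ./model.gguf -ngl 99
elif command -v ramalama >/dev/null 2>&1; then
    # Or go through the ramalama wrapper mentioned in the article,
    # which handles pulling the model and containerized runtimes.
    ramalama bench tinyllama
else
    # Neither tool present; nothing to benchmark on this machine.
    echo "llama-bench/ramalama not installed; install one to run this benchmark"
fi
```

Comparing the same run against a ROCm or CUDA build of llama.cpp on the same hardware is what produces the kind of cross-stack numbers Airlie reports.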
On the Intel side he was able to get Vulkan AI inferencing working with the ANV driver but was unable to get the oneAPI/SYCL stack working nicely. The AMD side is where things currently show the most potential. Airlie's results show that RADV Vulkan performance with ramalama/llama.cpp for token generation can be faster than using the official AMD ROCm compute stack. Prompt processing is where ROCm came out ahead, at least for now, but there is hope that some Mesa/RADV optimizations could put it ahead of ROCm there too.
Here is the data shared by Airlie with his GPU driver/hardware comparison:
Read more about his Vulkan AI comparison adventure via his blog. It will be interesting to run some benchmarks on our side once the Mesa Vulkan drivers further mature for AI workloads.