Ollama 0.12.11 was released this week as the newest feature update to this easy-to-run solution for deploying OpenAI GPT-OSS, DeepSeek-R1, Gemma 3, and other large language models. Exciting with ollama 0.12.11 is that it now supports the Vulkan API.
Launching ollama with the OLLAMA_VULKAN=1 environment variable set will now enable Vulkan API support as an alternative to the likes of AMD ROCm and NVIDIA CUDA acceleration. This is great for systems relying on open-source Vulkan drivers, older AMD graphics cards lacking ROCm support, or any AMD setup where the RADV driver is present but ROCm has not been installed. As we have seen when testing Llama.cpp with Vulkan, in some cases Vulkan can even be faster than the likes of ROCm.
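For those wanting to try it out, below is a quick sketch of enabling the Vulkan back-end from the command line. The OLLAMA_VULKAN=1 variable is the documented switch for this release; the model name is just an illustrative example.

```shell
# Start the ollama server with the Vulkan back-end enabled.
# OLLAMA_VULKAN=1 is the opt-in switch for the new Vulkan code path.
OLLAMA_VULKAN=1 ollama serve

# In another terminal, run a model as usual. Assuming a working
# Vulkan driver is present (e.g. RADV on AMD hardware), inference
# should be Vulkan-accelerated rather than falling back to the CPU.
ollama run gemma3
```

Note that if ollama is managed as a systemd service, the environment variable would need to be set within the service unit rather than on an interactive shell.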
This commit in ollama 0.12.11 lays out all the details on the Vulkan API support for ollama. Over the past few weeks ollama had offered the Vulkan support only as experimental.
The ollama 0.12.11 release also adds API support for logprobs, WebP image support within its new app, improved rendering performance, a preference for discrete GPUs over integrated GPUs when scheduling models, and various other fixes and enhancements.
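As a rough illustration of the new logprobs capability, here is a hedged sketch that assumes it is exposed through ollama's OpenAI-compatible /v1/chat/completions endpoint using the standard logprobs / top_logprobs parameters; the exact API surface may differ, so consult the release notes for specifics.

```shell
# Hypothetical request: assumes the new logprobs support mirrors the
# standard OpenAI-style parameters on ollama's OpenAI-compatible endpoint.
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "gemma3",
        "messages": [{"role": "user", "content": "Hello there"}],
        "logprobs": true,
        "top_logprobs": 3
      }'
```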
Downloads and more details on the ollama 0.12.11 release via GitHub.
