Whisper.cpp, the open-source, high-performance inference project built around OpenAI’s Whisper speech recognition model and from the same developers as Llama.cpp / GGML, is out with a big new release. Whisper.cpp 1.8.3 is capable of delivering a 12x performance boost for systems with integrated AMD and Intel graphics.
Whisper.cpp 1.8.3 is a very nice step forward for this AI-driven automatic speech recognition software using OpenAI’s Whisper model. The 1.8.3 release introduces proper integrated GPU (iGPU) support that is capable of delivering a “12x performance boost”, as noted in the merge request. This iGPU support complements the existing discrete GPU support and provides better hardware utilization, improved debugging, and an all-around better user experience.
It is important to note, though, that the claimed “12x performance boost” in the merge request comes down to a comparison against CPU-only performance on the same system:
“On AMD Ryzen 7 6800H with Radeon 680M integrated graphics and Intel Core Ultra 7 155H with Intel Arc Graphics, achieved 3-4x better realtime factor compared to CPU-only processing (CPU realtime factor: ~0.3).
This represents approximately 12x speedup compared to CPU-only mode, making integrated GPUs a highly viable option for users without discrete graphics cards.”
This integrated GPU support makes use of the Vulkan API for cross-driver/vendor compatibility.
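For those wanting to try the Vulkan-backed iGPU path, below is a minimal sketch of building and running Whisper.cpp with the Vulkan backend enabled, based on the project's standard CMake build. The `GGML_VULKAN` option, the `download-ggml-model.sh` helper, and the `whisper-cli` binary come from the upstream repository; exact names and paths may vary between releases, so treat this as an illustration rather than an authoritative guide.

```shell
# Clone the upstream repository
git clone https://github.com/ggml-org/whisper.cpp
cd whisper.cpp

# Configure with the GGML Vulkan backend, which is what enables
# offloading to integrated (and discrete) GPUs across vendors
cmake -B build -DGGML_VULKAN=1
cmake --build build -j --config Release

# Fetch a small English model for a quick test
sh ./models/download-ggml-model.sh base.en

# Transcribe the bundled sample; GPU offload via Vulkan is used
# when a capable device and driver are detected
./build/bin/whisper-cli -m models/ggml-base.en.bin -f samples/jfk.wav
```

On systems with both an iGPU and a discrete GPU, which Vulkan device gets used depends on the backend's device selection, so checking the startup log output is a quick way to confirm the integrated GPU is actually being exercised.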
Whisper.cpp 1.8.3 is available via GitHub and also brings language binding updates and a variety of other minor improvements. The Ascend Atlas 300I Duo NPU has also now been verified as working with Whisper.cpp.
