Intel’s open-source software developers today released OpenVINO 2024.5 as the newest major feature release of this cross-platform AI toolkit.
OpenVINO 2024.5 continues building out the Generative AI “GenAI” capabilities of the toolkit and broadening its large language model (LLM) coverage. This release adds support for Llama 3.2 in 1B and 3B sizes, Gemma 2 in 2B and 9B sizes, and YOLO11. There is also now LLM support on Intel NPUs for Llama 3 8B, Llama 2 7B, Mistral-v0.2 7B, Qwen2 7B Instruct, and Phi-3.
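For those wanting to try the NPU-side LLM support, a minimal sketch using the OpenVINO GenAI Python API looks something like the following. The model directory path is a placeholder and assumes a model already exported to OpenVINO IR (e.g. with optimum-cli), so treat it as illustrative rather than a verbatim recipe.

```python
import openvino_genai as ov_genai

# Placeholder path to an LLM already exported to OpenVINO IR, e.g.:
#   optimum-cli export openvino -m meta-llama/Meta-Llama-3-8B llama-3-8b-ov
model_dir = "./llama-3-8b-ov"

# Target the NPU instead of CPU/GPU when constructing the pipeline.
pipe = ov_genai.LLMPipeline(model_dir, "NPU")

# Generate a completion; max_new_tokens caps the response length.
print(pipe.generate("The Sun is yellow because", max_new_tokens=100))
```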
On the optimization front, there are new optimizations for Intel Core Ultra integrated graphics as well as Intel Arc discrete GPUs. There is also now official support for Intel Xeon 6 P-core processors, i.e. Granite Rapids, plus support for Intel Core Ultra 200S “Arrow Lake” desktop processors.
OpenVINO 2024.5 also offers preview-level support for Flax, a high-performance Python neural network library based on JAX. Rounding out OpenVINO 2024.5 are speculative decoding functionality in the GenAI API, preview GenAI API support for multi-modal AI deployments via multi-modal pipelines, and GenAI API support for LLMs on Intel NPUs.
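As a rough illustration of the new speculative decoding support, here is a minimal Python sketch modeled on the upstream GenAI samples. The model directories are placeholders for a main model and a smaller draft model that have both already been exported to OpenVINO IR, so the exact paths and model choices are assumptions.

```python
import openvino_genai

# Placeholder paths: a large "main" model plus a small "draft" model
# from the same family, both exported to OpenVINO IR ahead of time.
main_model_path = "./llama-3.2-3b-ov"
draft_model_path = "./llama-3.2-1b-ov"
device = "CPU"

# The draft model cheaply proposes candidate tokens that the main model
# then verifies in a single pass, which is where the speed-up comes from.
draft_model = openvino_genai.draft_model(draft_model_path, device)
pipe = openvino_genai.LLMPipeline(main_model_path, device, draft_model=draft_model)

config = openvino_genai.GenerationConfig()
config.max_new_tokens = 100
# How many candidate tokens the draft model proposes per verification step.
config.num_assistant_tokens = 5

print(pipe.generate("What is OpenVINO?", config))
```

The multi-modal pipeline preview follows a similar pattern. A rough sketch, again with a placeholder model path and assuming a vision-language model exported to IR:

```python
import numpy as np
import openvino as ov
import openvino_genai
from PIL import Image

# Wrap the input image in an OpenVINO tensor for the vision encoder.
image = ov.Tensor(np.array(Image.open("cat.jpg").convert("RGB")))

# Placeholder path to a vision-language model exported to OpenVINO IR.
pipe = openvino_genai.VLMPipeline("./minicpm-v-2_6-ov", "CPU")

config = openvino_genai.GenerationConfig()
config.max_new_tokens = 100

print(pipe.generate("Describe this image.", image=image, generation_config=config))
```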
OpenVINO 2024.5 is quite a big update for this AI toolkit. More details on the changes along with downloads are available via GitHub. I’ll be working on new OpenVINO benchmarks of the 2024.5 release soon.