Back in August, Intel announced LLM-Scaler as part of Project Battlematrix. LLM-Scaler is a new Intel software project providing optimized AI inference on Intel graphics hardware. A new beta release of its “llm-scaler-vllm” component is now available with expanded LLM model coverage.
Since that original August debut there have been several more releases of this Docker-based LLM-Scaler solution, delivering expanded model coverage and other new features geared for Battlemage GPUs. Out today is a new llm-scaler-vllm release to once again expand the scope of supported large language models.
The new version today is llm-scaler-vllm beta release 0.10.2-b5. Most significant with this updated Docker image is support for OpenAI’s GPT-OSS models for inference on Intel Arc (Pro) B-Series GPUs. GPT-OSS support should now be in good shape with this LLM-Scaler solution for Intel GPUs.
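Since llm-scaler-vllm is built on vLLM, the served models are reachable through vLLM’s OpenAI-compatible HTTP API. Below is a minimal sketch of querying a GPT-OSS model that way; the localhost endpoint, port 8000, and the openai/gpt-oss-20b model name are illustrative assumptions, not details confirmed by Intel’s release notes.

```python
# Minimal sketch: querying a GPT-OSS model served by llm-scaler-vllm
# through vLLM's OpenAI-compatible API. The endpoint, port, and model
# name below are illustrative assumptions, not values from Intel's docs.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # vLLM's default OpenAI-compatible endpoint
    api_key="unused",  # vLLM does not require a real API key by default
)

response = client.chat.completions.create(
    model="openai/gpt-oss-20b",  # hypothetical model name for this sketch
    messages=[{"role": "user", "content": "Summarize what vLLM does in one sentence."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```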
The updated LLM-Scaler also now enables the Qwen3-VL series and Qwen3-Omni series of models. That’s all for the listed changes with today’s beta release.
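Qwen3-VL is a vision-language series, so requests against it can carry image inputs alongside text. Below is a sketch of such a multimodal request through the same OpenAI-compatible API, assuming the same local endpoint and a hypothetical Qwen/Qwen3-VL-8B-Instruct model name; the image_url content format follows the standard OpenAI chat schema that vLLM accepts for multimodal models.

```python
# Minimal sketch: a vision-language request to a Qwen3-VL model served
# by llm-scaler-vllm. Endpoint and model name are illustrative assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

response = client.chat.completions.create(
    model="Qwen/Qwen3-VL-8B-Instruct",  # hypothetical model name for this sketch
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is shown in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)
```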
Those wanting to grab the new Intel LLM-Scaler-vLLM beta release can find the details on GitHub.
