Back in August Intel released LLM-Scaler 1.0 as part of Project Battlematrix to help get generative AI "GenAI" workloads running on Arc (Pro) B-Series graphics cards. Out today are two new LLM-Scaler beta releases for further enhancing the AI capabilities of Intel Battlemage GPUs.
First up is the LLM-Scaler vLLM 1.1 preview, the latest release for running vLLM on Intel Battlemage GPUs. The only listed change with this beta, though, is a bug fix for sym_int4 online quantization with multi-modal models.
The new release is available on Docker Hub as intel/llm-scaler-vllm:1.1-preview. Docker containers continue to be Intel's sole focus for deploying LLM-Scaler for GenAI workloads on Battlemage hardware.
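For those wanting to try it, getting the container going is straightforward. Below is a minimal sketch: the image tag comes from Intel's announcement, while the /dev/dri device passthrough and port mapping follow the common pattern for exposing an Intel GPU and a vLLM-style server from a container, and are assumptions rather than details from Intel's release notes.

```shell
# Pull the new LLM-Scaler vLLM preview image from Docker Hub
docker pull intel/llm-scaler-vllm:1.1-preview

# Run it with the Intel GPU exposed to the container.
# Passing through /dev/dri is the usual way to hand an Intel
# Arc/Battlemage GPU to a container; the port mapping here is
# an assumption for a typical vLLM serving endpoint.
docker run -it --rm \
    --device /dev/dri \
    -p 8000:8000 \
    intel/llm-scaler-vllm:1.1-preview
```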
Also released today is the llm-scaler-omni 0.1.0-b1 beta. This LLM-Scaler Omni beta integrates ComfyUI support on XPUs with a focus on Wan2.2 TI2V 5B, Wan2.2 T2V 14B, FLUX.1, Stable Diffusion 3.5 Large, and Qwen models. There is also now XPU support for xDit, Yunchang, and Raylight plus a variety of other improvements.