By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
World of SoftwareWorld of SoftwareWorld of Software
  • News
  • Software
  • Mobile
  • Computing
  • Gaming
  • Videos
  • More
    • Gadget
    • Web Stories
    • Trending
    • Press Release
Search
  • Privacy
  • Terms
  • Advertise
  • Contact
Copyright © All Rights Reserved. World of Software.
Reading: Intel llm-scaler-vllm Beta 1.2 Brings Support For New AI Models On Arc Graphics
Share
Sign In
Notification Show More
Font ResizerAa
World of SoftwareWorld of Software
Font ResizerAa
  • Software
  • Mobile
  • Computing
  • Gadget
  • Gaming
  • Videos
Search
  • News
  • Software
  • Mobile
  • Computing
  • Gaming
  • Videos
  • More
    • Gadget
    • Web Stories
    • Trending
    • Press Release
Have an existing account? Sign In
Follow US
  • Privacy
  • Terms
  • Advertise
  • Contact
Copyright © All Rights Reserved. World of Software.
World of Software > Computing > Intel llm-scaler-vllm Beta 1.2 Brings Support For New AI Models On Arc Graphics
Computing

Intel llm-scaler-vllm Beta 1.2 Brings Support For New AI Models On Arc Graphics

News Room
Last updated: 2025/12/11 at 5:37 AM
News Room Published 11 December 2025
Share
Intel llm-scaler-vllm Beta 1.2 Brings Support For New AI Models On Arc Graphics
SHARE

Following yesterday’s release of a new llm-scaler-omni beta there is now a new beta feature release of llm-scaler-vllm that provides the Intel-optimized version of vLLM within a Docker container that is set and ready to go for AI on modern Arc Graphics hardware. With today’s llm-scaler-vllm 1.2 beta release there is support for a variety of additional large language models (LLMs) and other improvements.

Going the route of llm-scaler-vllm continues to be Intel’s preferred choice for customers to leverage vLLM for AI workloads on their discrete graphics hardware. With this new llm-scaler-vllm 1.2 beta release there is support for new models and other enhancements to benefit the Intel vLLM experience:

– Fix 72-hour hang issue
– MoE-Int4 support for Qwen3-30B-A3B
– Bpe-Qwen tokenizer support
– Enable Qwen3-VL Dense/MoE models
– Enable Qwen3-Omni models
– MinerU 2.5 Support
– Enable whisper transcription models
– Fix minicpmv4.5 OOM issue and output error
– Enable ERNIE-4.5-vl models
– Enable Glyph based GLM-4.1V-9B-Base
– Attention kernel optimizations for decoding phases for all workloads (>10% e2e throughput on 10+ models with all in/out seq length)
– Gpt-oss 20B and 120B support in mxfp4 with optimized performance
– MoE models optimizations, output throughput:Qwen3-30B-A3B 2.6x e2e improvement; DeeSeek-V2-lite 1.5x improvement.
– New models: added 8 multi-modality models, image/video are supported.
– vLLM 0.10.2 with new features: P/D disaggregation(experimental), tooling, reasoning output, structured output,
– fp16/bf16 gemm optimizations for batch size 1-128. obvious improvement for small batch sizes.
– Bug fixes

This work will be especially important for next year’s Crescent Island hardware release.

Intel AI Software

More details on the new beta release via GitHub while the llm-scaler-vllm Docker container is available via the Docker Hub container image library.

Sign Up For Daily Newsletter

Be keep up! Get the latest breaking news delivered straight to your inbox.
By signing up, you agree to our Terms of Use and acknowledge the data practices in our Privacy Policy. You may unsubscribe at any time.
Share This Article
Facebook Twitter Email Print
Share
What do you think?
Love0
Sad0
Happy0
Sleepy0
Angry0
Dead0
Wink0
Previous Article Europe’s best security guarantee against Russia is the Ukrainian army Europe’s best security guarantee against Russia is the Ukrainian army
Next Article Clair Obscur: Expedition 33 – how a tiny studio developed the Belle Époque-set gaming blockbuster Clair Obscur: Expedition 33 – how a tiny studio developed the Belle Époque-set gaming blockbuster
Leave a comment

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Stay Connected

248.1k Like
69.1k Follow
134k Pin
54.3k Follow

Latest News

Framework says it will not ‘gouge customers like Dell’ in RAM price crisis
Framework says it will not ‘gouge customers like Dell’ in RAM price crisis
News
With a Spike in RAM Prices, Now Might Be the Best Time to Buy a Laptop
With a Spike in RAM Prices, Now Might Be the Best Time to Buy a Laptop
Gadget
Swift’s #Predicate Explained: How Type-Safe Filtering Works in SwiftData | HackerNoon
Swift’s #Predicate Explained: How Type-Safe Filtering Works in SwiftData | HackerNoon
Computing
5 Little Known Facts About Your Smart Home – BGR
5 Little Known Facts About Your Smart Home – BGR
News

You Might also Like

Swift’s #Predicate Explained: How Type-Safe Filtering Works in SwiftData | HackerNoon
Computing

Swift’s #Predicate Explained: How Type-Safe Filtering Works in SwiftData | HackerNoon

0 Min Read
The TechBeat: Exploiting EIP-7702 Delegation in the Ethernaut Cashback Challenge — A Step-by-Step Writeup (12/11/2025) | HackerNoon
Computing

The TechBeat: Exploiting EIP-7702 Delegation in the Ethernaut Cashback Challenge — A Step-by-Step Writeup (12/11/2025) | HackerNoon

7 Min Read
Apple’s Design Lightning Rod Just Joined Meta. What Now? | HackerNoon
Computing

Apple’s Design Lightning Rod Just Joined Meta. What Now? | HackerNoon

10 Min Read
KDE Gear 25.12 Released For Shipping The Latest KDE Applications
Computing

KDE Gear 25.12 Released For Shipping The Latest KDE Applications

1 Min Read
//

World of Software is your one-stop website for the latest tech news and updates, follow us now to get the news that matters to you.

Quick Link

  • Privacy Policy
  • Terms of use
  • Advertise
  • Contact

Topics

  • Computing
  • Software
  • Press Release
  • Trending

Sign Up for Our Newsletter

Subscribe to our newsletter to get our newest articles instantly!

World of SoftwareWorld of Software
Follow US
Copyright © All Rights Reserved. World of Software.
Welcome Back!

Sign in to your account

Lost your password?