Alibaba Group Holding Ltd. today released an artificial intelligence model that it says can outperform GPT-5.2 and Claude 4.5 Opus at some tasks.
The new large language model, Qwen3.5, is available on Hugging Face under an open-source license.
By default, Qwen3.5 can process prompts of up to 262,144 tokens, and developers can nearly quadruple that limit with additional configuration. Prompts may include text in more than 210 languages and dialects along with images such as data visualizations.
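Because the model ships on Hugging Face, it can presumably be loaded through the standard Transformers workflow. The sketch below is illustrative only: the repository ID "Qwen/Qwen3.5" is a hypothetical placeholder, not a confirmed model name.

```python
# Minimal sketch of loading the model with the standard Transformers API.
# "Qwen/Qwen3.5" is a hypothetical placeholder repository ID.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3.5"  # hypothetical placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Summarize this chart for me."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# The 262,144-token default context applies to the full prompt;
# longer contexts reportedly require extra configuration.
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```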
Qwen3.5 is a mixture-of-experts model, which means it comprises multiple neural networks optimized for different tasks. When the LLM receives a prompt, it uses only 10 of those networks to generate an answer. Activating a subset of a model's components to process prompts is more hardware-efficient than running input through all of its artificial neurons. Qwen3.5 has 397 billion parameters in total, of which 17 billion are used per prompt.
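The routing step can be illustrated with a toy top-k gate. The sketch below assumes a simple softmax router; the expert count and dimensions are illustrative and do not reflect Qwen3.5's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 64   # illustrative; not Qwen3.5's real expert count
TOP_K = 10         # the article says 10 networks are activated per prompt
DIM = 32           # toy hidden dimension

# Each "expert" here is a random linear layer standing in for a
# full feed-forward network.
experts = [rng.standard_normal((DIM, DIM)) / np.sqrt(DIM) for _ in range(NUM_EXPERTS)]
router = rng.standard_normal((DIM, NUM_EXPERTS)) / np.sqrt(DIM)

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route a token vector to its top-k experts and mix their outputs."""
    logits = x @ router                   # one score per expert
    top = np.argsort(logits)[-TOP_K:]     # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()              # softmax over the chosen experts only
    # Only the selected experts run, which is where the hardware savings
    # come from: the other NUM_EXPERTS - TOP_K networks stay idle.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(DIM)
print(moe_layer(token).shape)  # (32,)
```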
Alibaba has also equipped the model with several other optimizations designed to boost its efficiency.
A large language model’s attention heads, the mechanisms it uses to determine which data points to take into account when making a decision, usually scale quadratically with prompt length. That means doubling the amount of data in a prompt roughly quadruples the amount of memory needed to produce a response. Qwen3.5 combines standard quadratic attention heads with so-called linear attention heads, which require considerably less memory.
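The difference shows up in how the two variants are computed. The sketch below uses a kernel feature map (elu + 1, as in early linear-attention research) as an assumed stand-in; Alibaba has not said which linear attention variant Qwen3.5 uses.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def quadratic_attention(q, k, v):
    """Standard attention: materializes an n x n score matrix,
    so memory grows with the square of the sequence length n."""
    scores = q @ k.T / np.sqrt(q.shape[-1])   # shape (n, n)
    return softmax(scores) @ v

def linear_attention(q, k, v):
    """Kernelized linear attention (illustrative variant): the n x n
    matrix never exists; only d x d summaries are kept, so memory
    grows linearly with sequence length."""
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1 feature map
    q, k = phi(q), phi(k)
    kv = k.T @ v                    # running summary, shape (d, d)
    z = k.sum(axis=0)               # normalizer, shape (d,)
    return (q @ kv) / (q @ z)[:, None]

rng = np.random.default_rng(0)
n, d = 1024, 16
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
print(quadratic_attention(q, k, v).shape, linear_attention(q, k, v).shape)
```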
The model also uses another efficiency-boosting technology called a gated delta network. The technology combines two deep learning techniques known as gating and the delta rule.
Gating enables an LLM to remove data that it doesn’t need for a task from its memory, which lowers hardware usage. The delta rule, in turn, is a classic error-correction learning rule and a single-layer precursor of the back-propagation algorithm: rather than overwriting stored values wholesale, it updates them in proportion to the difference between new information and what is already stored. Last year, Nvidia Corp. researchers determined that combining the two methods reduces the amount of hardware needed to train LLMs.
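A toy fast-weight memory makes the combination concrete. The sketch below is an assumed simplification inspired by the gated delta update described in Nvidia's research, not Qwen3.5's actual implementation: a gate decays the stored state, and the delta rule writes only the difference between the incoming value and what the memory already predicts for the current key.

```python
import numpy as np

def gated_delta_step(S, k, v, alpha, beta):
    """One toy gated-delta update of a fast-weight memory S (d x d).

    alpha: gate in [0, 1] that decays (forgets) the existing state.
    beta:  write strength for the delta-rule correction.
    """
    S = alpha * S                 # gating: shrink memory the model no longer needs
    v_pred = S @ k                # what the memory currently recalls for key k
    S = S + beta * np.outer(v - v_pred, k)  # delta rule: store only the difference
    return S

rng = np.random.default_rng(0)
d = 8
S = np.zeros((d, d))
for _ in range(16):               # stream of toy key/value pairs
    k = rng.standard_normal(d)
    k /= np.linalg.norm(k)        # unit-norm keys keep the update stable
    v = rng.standard_normal(d)
    S = gated_delta_step(S, k, v, alpha=0.95, beta=0.5)

print(np.round(S @ k - v, 2))     # last pair is partially stored, not exact
```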
Alibaba compared Qwen3.5 to GPT-5.2 and Claude 4.5 Opus across more than 30 benchmarks. The model outperformed both on IFBench, a test that measures how well LLMs follow user instructions. In other cases, Qwen3.5 bested one of the LLMs but not the other. For example, it topped the score that Claude 4.5 Opus set on the HMMT reasoning benchmark but fell behind GPT-5.2.
Alibaba says that Qwen3.5 is also adept at processing multimodal data. It outperformed Qwen3-VL, a model built specifically for image analysis tasks, across several visual reasoning and coding benchmarks.
