IBM Corp. on Thursday open-sourced Granite 4, a language model series that combines elements of two different neural network architectures.
The family includes four models at launch, ranging in size from 3 billion to 32 billion parameters. IBM claims they can outperform comparably sized models while using less memory.
Granite-4.0-Micro, one of the smallest models in the lineup, is based on the Transformer architecture that powers most large language models. The architecture’s flagship feature is its so-called attention mechanism, which enables an LLM to review a snippet of text, identify the most relevant parts and weight them more heavily during the decision-making process.
The three other Granite 4 models combine an attention mechanism with processing components based on the Mamba neural network architecture, a Transformer alternative. The technology’s main selling point is that it’s more hardware-efficient.
Like Transformer models, Mamba can identify the most important pieces of data in a prompt and adjust its processing accordingly. The difference is that it does so using not an attention mechanism but rather a so-called state space model. That’s a mathematical structure originally used for tasks such as calculating the flight path of spacecraft.
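At its core, a state space model processes a sequence by repeatedly updating a fixed-size hidden state. The sketch below shows that recurrence in its simplest scalar form; it is an illustration of the general technique only, not IBM's or Mamba's actual implementation, and the coefficient values are made up for the toy example.

```python
# Minimal discrete state space model (SSM) recurrence, illustrative only:
#     h_t = A * h_{t-1} + B * x_t      (state update)
#     y_t = C * h_t                    (readout)
# Real SSM-based models like Mamba use learned matrices and selective
# parameterizations; this scalar version just shows the mechanics.

def ssm_scan(A, B, C, xs):
    """Run a scalar linear state space model over an input sequence."""
    h = 0.0
    ys = []
    for x in xs:
        h = A * h + B * x   # state update: one fixed-size state, however long xs is
        ys.append(C * h)    # readout at each step
    return ys

# Toy system: the state is an exponentially decaying sum of past inputs.
print(ssm_scan(0.5, 1.0, 1.0, [1.0, 0.0, 0.0]))  # -> [1.0, 0.5, 0.25]
```

Because each step only touches the current state, memory per token is constant, which is the property the article describes.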
The Transformer architecture’s attention mechanism requires a significant amount of memory to process long prompts. Every time the length of a prompt doubles, the attention mechanism’s RAM usage quadruples. Mamba models require a fraction of the memory, which reduces inference costs.
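The scaling difference comes down to simple arithmetic: attention builds an n-by-n score matrix over the prompt, while an SSM carries a fixed-size state. The figures below are element counts used to illustrate the growth rates, not real RAM measurements for any particular model.

```python
# Back-of-the-envelope illustration of the scaling described above.

def attention_scores(n_tokens):
    """Elements in one n x n attention score matrix."""
    return n_tokens * n_tokens

def ssm_state(state_size=16):
    """Elements in a fixed-size SSM hidden state (size is arbitrary here)."""
    return state_size

for n in (1_000, 2_000, 4_000):
    print(f"{n} tokens: attention {attention_scores(n):>10,} | ssm {ssm_state()}")
# Doubling the prompt from 1,000 to 2,000 tokens quadruples the attention
# matrix (1,000,000 -> 4,000,000) while the SSM state does not grow at all.
```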
The Granite 4 series is based on Mamba-2, the latest release of the architecture, which debuted early last year. Mamba-2 compresses one of the technology’s core components into about 25 lines of code, which enables it to perform some tasks using less hardware than the original version of the architecture.
The most advanced Granite 4 model, Granite-4.0-H-Small, includes 32 billion parameters. It has a mixture-of-experts design that activates 9 billion parameters to answer prompts. IBM envisions developers using the model for tasks such as processing customer support requests.
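A mixture-of-experts model holds many parameter sets ("experts") but runs only a few per token, which is how a 32-billion-parameter model can activate just 9 billion parameters per prompt. The sketch below shows the general top-k routing idea; the expert functions, scores, and sizes are invented for illustration and say nothing about Granite’s actual router design.

```python
# Illustrative top-k mixture-of-experts routing sketch (not Granite's
# actual design). A router scores every expert for each input, but only
# the k highest-scoring experts run, so only a fraction of the model's
# total parameters is active at a time.

def top_k_experts(router_scores, k=2):
    """Return the indices of the k highest-scoring experts, in index order."""
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    return sorted(ranked[:k])

def moe_forward(x, experts, router_scores, k=2):
    """Sum the outputs of only the selected experts."""
    active = top_k_experts(router_scores, k)
    return sum(experts[i](x) for i in active)

# Toy example: 4 scalar "experts"; the router favors experts 1 and 3,
# so experts 0 and 2 never execute for this input.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x * x]
scores = [0.1, 0.7, 0.05, 0.6]
print(top_k_experts(scores, k=2))        # -> [1, 3]
print(moe_forward(3.0, experts, scores)) # -> (3.0 * 2) + (3.0 * 3.0) = 15.0
```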
The two other Mamba-Transformer algorithms in the series, Granite-4.0-H-Tiny and Granite-4.0-H-Micro, feature 7 billion and 3 billion parameters, respectively. They’re designed for latency-sensitive use cases that prioritize speed over processing accuracy.
IBM compared the memory requirements of Granite-4.0-H-Tiny and its previous-generation Granite 3.3 8B model in an internal benchmark test. The former used 15 gigabytes of RAM, one-sixth of what Granite 3.3 8B required. IBM says its new models also provide increased output quality.
“While the new Granite hybrid architecture contributes to the efficiency and efficacy of model training, most improvement in model accuracy are derived from advancements in our training (and post-training) methodologies and the ongoing expansion and refinement of the Granite training data corpus,” IBM staffers wrote in a blog post.
Granite 4 is available via IBM’s watsonx.ai service and more than a half-dozen third-party platforms, including Hugging Face. Down the line, the company plans to bring the models to Amazon SageMaker JumpStart and Microsoft Azure AI. IBM also plans to expand the Granite 4 lineup with new algorithms that will offer more advanced reasoning capabilities.
Image: IBM