By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
World of SoftwareWorld of SoftwareWorld of Software
  • News
  • Software
  • Mobile
  • Computing
  • Gaming
  • Videos
  • More
    • Gadget
    • Web Stories
    • Trending
    • Press Release
Search
  • Privacy
  • Terms
  • Advertise
  • Contact
Copyright © All Rights Reserved. World of Software.
Reading: Cerebras Systems blazes a trail for AI inference, powering advanced reasoning in real time – News
Share
Sign In
Notification Show More
Font ResizerAa
World of SoftwareWorld of Software
Font ResizerAa
  • Software
  • Mobile
  • Computing
  • Gadget
  • Gaming
  • Videos
Search
  • News
  • Software
  • Mobile
  • Computing
  • Gaming
  • Videos
  • More
    • Gadget
    • Web Stories
    • Trending
    • Press Release
Have an existing account? Sign In
Follow US
  • Privacy
  • Terms
  • Advertise
  • Contact
Copyright © All Rights Reserved. World of Software.
World of Software > News > Cerebras Systems blazes a trail for AI inference, powering advanced reasoning in real time – News
News

Cerebras Systems blazes a trail for AI inference, powering advanced reasoning in real time – News

News Room
Last updated: 2025/05/15 at 11:18 PM
News Room Published 15 May 2025
Share
SHARE

Artificial intelligence chip startup Cerebras Systems Inc. is heralding the launch of Qwen3-32B, one of the most advanced and powerful open-weight large language models in the world, as proof of its ability to outcompete Nvidia Corp. in AI inference.

The company said its inference platform has achieved the previously impossible — enabling advanced AI reasoning to be performed in less than two seconds, or essentially in real time.

The Qwen3-32B model is available now on the Cerebras Inference Platform, a cloud-based service that the company claims is able to run powerful LLMs at many times faster than comparable offerings based on Nvidia’s graphics processing units.

Cerebras is the creator of a specialized, high-performance computing architecture that runs on dinner plate-sized silicon wafers. Its flagship offering is the Cerebras WSE-3 processor that launched in March 2024, based on a five-nanometer process and featuring 1.4 trillion transistors. It provides more than 900,000 compute cores, meaning it has 52 times more than the number of cores found on a single Nvidia H100 GPU.

The WSE-3 processor also features 44 gigabytes of onboard static random-access memory, solving the major bottleneck associated with Nvidia’s chips – the need for greater memory bandwidth.

Cerebras launched its AI inference service last August. Inference refers to the process of running live data through a trained AI model to make a prediction or solve a task, and high performance is becoming essential to the AI industry as more models and applications move into production.

The company claims that its inference platform is the best in the world, capable of running the most powerful AI models at speeds of more than 2,000 tokens per second, essentially leaving Nvidia-based inference services in the dust. It has previously said that its platform enables Meta Platforms Inc.’s Llama 4 run up to 20 times faster than usual, and makes similar claims about many other top-tier AI models.

Such claims are given credence by a growing list of enterprise customers, including AI model makers such as Mistral AI, and the AI-powered search engine Perplexity AI Inc.

With the launch of Qwen3-32B, Cerebras co-founder and Chief Executive Andrew Feldman says the company’s inference platform is fast enough to “reshape how real-time AI gets built.”

“This is the first time a world-class reasoning model – on par with DeepSeek R1 and OpenAI’s o-series — can return answers instantly,” he said.

Qwen3-32B is one of the most advanced open-weights reasoning models ever created. It was developed by the Chinese cloud computing giant Alibaba Group Holding Ltd., and has shown on benchmarks that it can match the performance of leading closed models such as GPT-4.1 and DeepSeek R1.

When running on the Cerebras Inference Platform, the Qwen3-32B model can perform sophisticated reasoning and generate a response in as little as 1.2 seconds, Cerebras said. That’s more than 60 times faster than the best competing models, including OpenAI’s o3.

The availability of Qwen3-32B on the Cerebras platform paves the way for a new generation of much more responsive AI agents, copilots and automation workloads, the company said. Reasoning models are the most powerful kinds of LLMs, capable of using multistep logic to guide structured decision-making and the use of third-party software tools. They have enormous potential, but they have always struggled with latency, requiring anywhere from 30 to 90 seconds to generate outputs. This latency has always limited their usefulness – until now, Cerebras says.

Even more encouraging is that Cerebras says it’s able to offer rapid access to Qwen3-32B at very reasonable prices, starting at just 40 cents per million input tokens, and 80 cents per million output tokens. According to Feldman, that makes it about 10 times cheaper than OpenAI’s GPT-4.1 model.

“Here we are with an open-source model that’s better than, or equal to, the best models in the world in most metrics,” Feldman said. “And we’re serving it for pennies on the dollar.”

Cerebras is doing its best to encourage developers to give it a go, saying that everybody will receive 1 million free tokens per day to start experimenting, with no waitlist. Because Qwen3-32B is Apache 2.0 licensed, it’s entirely free to use, and can be integrated into applications using OpenAI- and Claude-compatible endpoints, the company said.

Featured image: News/Microsoft Designer

Your vote of support is important to us and it helps us keep the content FREE.

One click below supports our mission to provide free, deep, and relevant content.  

Join our community on YouTube

Join the community that includes more than 15,000 #CubeAlumni experts, including Amazon.com CEO Andy Jassy, Dell Technologies founder and CEO Michael Dell, Intel CEO Pat Gelsinger, and many more luminaries and experts.

“TheCUBE is an important partner to the industry. You guys really are a part of our events and we really appreciate you coming and I know people appreciate the content you create as well” – Andy Jassy

THANK YOU

Sign Up For Daily Newsletter

Be keep up! Get the latest breaking news delivered straight to your inbox.
By signing up, you agree to our Terms of Use and acknowledge the data practices in our Privacy Policy. You may unsubscribe at any time.
Share This Article
Facebook Twitter Email Print
Share
What do you think?
Love0
Sad0
Happy0
Sleepy0
Angry0
Dead0
Wink0
Previous Article SpaceX Revives Battle With EchoStar Over Spectrum It Wants for Cellular Starlink
Next Article Core Technologies Behind Spatial Digital Twins | HackerNoon
Leave a comment

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Stay Connected

248.1k Like
69.1k Follow
134k Pin
54.3k Follow

Latest News

What’s New to Max Streaming This Week (May 16-23)
News
Save $100 on the Garmin Forerunner 265 with this top smartwatch deal
Gadget
Zeekr’s privatization will save “several billion yuan” in R&D · TechNode
Computing
Take your tunes on the go with the JBL Go 4 — now $10 off at Amazon
News

You Might also Like

News

What’s New to Max Streaming This Week (May 16-23)

5 Min Read
News

Take your tunes on the go with the JBL Go 4 — now $10 off at Amazon

2 Min Read
News

Security tests reveal serious vulnerability in government’s One Login digital ID system | Computer Weekly

8 Min Read
News

Fortnite is blocked from Apple’s US App Store… again

4 Min Read
//

World of Software is your one-stop website for the latest tech news and updates, follow us now to get the news that matters to you.

Quick Link

  • Privacy Policy
  • Terms of use
  • Advertise
  • Contact

Topics

  • Computing
  • Software
  • Press Release
  • Trending

Sign Up for Our Newsletter

Subscribe to our newsletter to get our newest articles instantly!

World of SoftwareWorld of Software
Follow US
Copyright © All Rights Reserved. World of Software.
Welcome Back!

Sign in to your account

Lost your password?