The future of agentic artificial intelligence — intelligent systems that act autonomously on behalf of humans — is coming into focus, and two companies are shaping how it takes form inside the enterprise. IBM Corp. and Groq Inc. today announced a strategic partnership that brings together IBM’s watsonx Orchestrate, the company’s enterprise-grade agent orchestration and automation platform, with Groq’s language processing unit and GroqCloud inference infrastructure.
This collaboration marks a pivotal shift in AI infrastructure and orchestration. It combines IBM’s governance, hybrid interoperability and workflow orchestration with Groq’s deterministic, compiler-driven speed — allowing enterprise AI agents to perform at human-level responsiveness across regulated and hybrid environments.
Watch this interview clip with me and Groq Chief Executive Jonathan Ross (pictured) on theCUBE at the Raise Summit this summer: "Token as a Service and the Future of Compute."
IBM’s play: Orchestrating trustworthy agentic workflows
IBM’s watsonx Orchestrate has quietly evolved into one of the most sophisticated agentic AI platforms in the market. It allows nontechnical users to build and deploy multi-agent workflows using plain-English instructions — automating tasks across HR, customer service, finance and operations.
What sets Orchestrate apart is its semantic control plane, which decomposes goals into tasks, coordinates multi-agent collaboration and executes across hybrid environments spanning on-prem, public cloud and software-as-a-service systems. Built-in AgentOps capabilities provide lifecycle management, observability and policy-based governance, ensuring compliance and control even in mission-critical deployments.
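The decompose-coordinate-execute pattern described above can be sketched in a few lines. This is a conceptual illustration only: the goal, task names and agent names are invented, and this is not watsonx Orchestrate code.

```python
# Toy sketch of a goal -> tasks -> agents orchestration loop, in the spirit
# of a "semantic control plane." All names here are hypothetical.
def decompose(goal: str) -> list[str]:
    # A real control plane would use an LLM to break the goal down;
    # here a single goal is hard-coded for illustration.
    playbook = {
        "onboard new employee": ["create accounts", "assign laptop", "schedule training"],
    }
    return playbook.get(goal, [goal])

# Each specialized agent owns a task type (illustrative mapping).
AGENTS = {
    "create accounts": "it-agent",
    "assign laptop": "procurement-agent",
    "schedule training": "hr-agent",
}

def orchestrate(goal: str) -> list[tuple[str, str]]:
    # Route every decomposed task to the agent responsible for it.
    return [(task, AGENTS.get(task, "default-agent")) for task in decompose(goal)]

for task, agent in orchestrate("onboard new employee"):
    print(f"{agent}: {task}")
```

In a production platform, the routing layer would also carry the governance and observability hooks the article describes; the sketch shows only the decomposition and dispatch skeleton.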
IBM leaders told me their goal is to make agentic apps trustworthy, auditable and composable, with a foundation rooted in open source (via Red Hat vLLM) and integration across the watsonx stack, including watsonx.data. The platform’s reach extends even to IBM Z and LinuxONE, bringing agentic automation to mainframes used by banks, insurers and governments.
“We’re not just automating workflows — we’re orchestrating intelligence,” an IBM executive said. “Agentic AI means thousands of specialized AI agents working in concert under enterprise-grade governance.”
Groq’s edge: Determinism, simplicity and speed
While IBM focuses on orchestration, Groq’s differentiator is its deterministic architecture. Its LPU doesn’t rely on the complex, dynamic scheduling used by GPUs. Instead, it uses a compiler-driven approach that pre-schedules every operation in advance, eliminating runtime overhead and enabling clock-cycle-level predictability.
This deterministic design translates into up to 10× faster inference performance and sub-millisecond response times. Combined with GroqCloud and GroqRack, enterprises can now deploy inference systems that are not only faster but simpler, more energy-efficient and easier to manage.
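The difference between runtime scheduling and compile-time scheduling can be made concrete with a toy example. The sketch below is purely conceptual, not Groq's actual compiler: the point is that a static schedule is fixed once, before execution, so every run replays the same order with no runtime decisions.

```python
# Toy contrast: dynamic (runtime) scheduling vs. a compiler-style static
# schedule over a small dependency graph of operations.
def dynamic_run(ops: dict) -> list:
    # A GPU-style runtime scheduler picks among whatever ops are "ready"
    # at each step; on real hardware that choice can vary run to run.
    done, order = set(), []
    while len(done) < len(ops):
        ready = [name for name, deps in ops.items()
                 if name not in done and deps <= done]
        name = ready[0]  # runtime choice point
        order.append(name)
        done.add(name)
    return order

def static_schedule(ops: dict) -> list:
    # Compiler-style: topologically order the ops once, ahead of time.
    # Execution then replays exactly this schedule, every time.
    done, schedule = set(), []
    while len(schedule) < len(ops):
        for name in sorted(ops):
            if name not in done and ops[name] <= done:
                schedule.append(name)
                done.add(name)
    return schedule

ops = {"load": set(), "matmul": {"load"}, "softmax": {"matmul"}, "store": {"softmax"}}
print(static_schedule(ops))  # same plan on every run
```

Because the static plan removes runtime choice points, latency becomes predictable down to the step, which is the property the article attributes to Groq's clock-cycle-level determinism.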
When I asked Ross how he defines the company — chip company, systems company or something else — he replied: “We try to avoid labels. We do token as a service, but we also sell hardware.”
Ross has been evangelizing the importance of inference since long before it was fashionable. “We’ve been doing inference since 2016, back before it was popular,” he told me. “Inference is the killer app.”
Today, that early conviction has paid off. Groq’s LPU and compiler model have made it one of the fastest-growing infrastructure companies in the AI ecosystem, with enterprise and government clients deploying GroqCloud for real-time AI in sectors from healthcare to trading and robotics.
Product-led growth and the new compute supply chain
Ross’s go-to-market strategy reflects Groq’s confidence in its technology.
“Product-led growth,” he explained. “Our CMO likes to say, the fastest way to kill a bad product is good marketing. If your product can’t hold up, marketing kills it fast. We have a great product, so our goal is to get it out and let people play with it.”
That philosophy is working. During one meeting Ross recounted, a chief technology officer was so impressed he immediately turned to his colleague and said, “Can you start benchmarking tonight?” The response: “Groq’s my default — I use it for everything.”
But speed isn’t just about inference — it’s also about delivery. In the global scramble for compute capacity, Groq is cutting through the backlog that plagues GPU suppliers.
“Compute security is becoming as important as energy security,” Ross said. “When you order GPUs, it can take 24 months. Jensen [Huang] even said at GTC, if you want compute in two years, sign the PO today. We can deliver in about six months because our supply chain is much simpler. If you want to catch up in AI, you have to think in months, not years.”
Why the IBM-Groq partnership matters
By bringing Groq’s deterministic performance into the watsonx Orchestrate ecosystem, IBM can now offer AI agents that think and act in real time. Enterprise customers will have access to GroqCloud inference directly within Orchestrate, enabling instant analysis, decisioning and automation.
The partnership also extends to integrating Red Hat vLLM with Groq’s LPU stack — bridging open-source inference orchestration with Groq’s ultra-fast hardware layer. This makes it easier for developers to migrate existing AI applications, including retrieval-augmented generation and vector database workloads, to GroqCloud with minimal code changes.
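A "minimal code change" migration typically means the application's request code stays untouched and only the endpoint configuration changes. The sketch below illustrates that pattern with a stand-in client class, not a real SDK; the base URL and model name shown for GroqCloud are illustrative assumptions, so check the provider's documentation for actual values.

```python
# Sketch: retargeting an OpenAI-style chat client at a different backend.
# Only the constructor arguments change; the request-building code does not.
from dataclasses import dataclass

@dataclass
class ChatClient:
    base_url: str
    model: str

    def build_request(self, prompt: str) -> dict:
        # Same request shape regardless of which backend serves it.
        return {
            "url": f"{self.base_url}/chat/completions",
            "json": {
                "model": self.model,
                "messages": [{"role": "user", "content": prompt}],
            },
        }

# Before: a generic OpenAI-compatible endpoint (placeholder URL).
client = ChatClient(base_url="https://api.example.com/v1", model="some-model")

# After: hypothetical GroqCloud values — the only lines that change.
client = ChatClient(base_url="https://api.groq.com/openai/v1",
                    model="llama-3.1-8b-instant")

req = client.build_request("Summarize this contract.")
print(req["url"])
```

Because the request shape is shared, RAG pipelines and other workloads built against an OpenAI-style interface can, in principle, be repointed this way rather than rewritten.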
From a broader lens, this collaboration unites enterprise orchestration, open hybrid architectures and deterministic compute — three pillars that could define the next generation of AI infrastructure.
“When enterprises go into production, they must ensure complex workflows can be deployed successfully,” said Rob Thomas, IBM’s senior vice president of software and chief commercial officer. “Our partnership with Groq underscores our commitment to helping clients achieve business value from AI — reliably and at scale.”
The big picture: From experiments to execution
The IBM-Groq partnership reflects a broader industry pivot from experimentation to execution. AI isn’t just about model training anymore — it’s about how fast and reliably inference can happen, how agentic systems collaborate and how enterprises can govern them.
In that context, IBM and Groq represent two halves of the same equation: orchestration and speed, trust and performance, governance and determinism.
“Inference is the heartbeat of AI,” Ross told me. “If you can’t run models instantly, you can’t make agents act intelligently. Determinism is what makes agentic systems viable.”
As AI agents move into regulated industries, the ability to act in milliseconds while maintaining transparency and compliance will become the defining capability. IBM and Groq are betting that this combination — agentic intelligence plus deterministic speed — will usher in the next era of enterprise AI.