Artificial intelligence deployments aren’t just adding load to the cloud — they’re reshaping it from the inside out. As graphics processing unit clusters scale and inference workloads multiply, the abstraction layers that once made cloud computing feel seamless are straining under real-time performance demands, turning AI observability into a requirement, not a feature.
Current demand comes at an inflection point that may eclipse the first cloud-native wave. But AI isn’t merely another application category; it’s a fundamentally different workload model — one that exposes blind spots in monitoring, data movement and coordination across compute, storage and networking, according to Chen Goldberg (pictured), senior vice president at CoreWeave Inc. The question now isn’t whether AI will reshape infrastructure, but how cloud architectures must adapt to support technologies that operate at a fundamentally different scale and velocity.
“We’re in a similar but probably bigger moment [than the cloud-native wave]. There is yet a new technology,” Goldberg said. “We have large language models. We have AI models. And then what can you do with them? How can you really unleash innovation and just see what’s possible?”
Goldberg spoke with theCUBE’s Dave Vellante and Rebecca Knight at Vast Forward 2026, during an exclusive broadcast on theCUBE, News Media’s livestreaming studio. They discussed how, under AI workloads, the cracks are starting to show for cloud 1.0 — and what comes next. (* Disclosure below.)
Why AI observability changes the stakes
If developers and researchers are going to trust AI outcomes, observability can’t be an afterthought bolted onto existing systems, according to Goldberg, who worked on cloud infrastructure at Google during the early Kubernetes era. That means pinpointing bottlenecks, tracing how data feeds into GPUs and measuring the real-world performance of training and inference jobs in production.
“What [CoreWeave has] seen with AI workloads — because compute matters, storage matters, network matters — is we need to make sure that if I’m running a job, a training job, an AI application, I can understand what’s happening across the stack and make decisions and be proactive about it,” Goldberg said. “That’s one example of something that we really thought about as a principle.”
CoreWeave, a GPU-focused cloud provider, built its infrastructure specifically for AI training and inference workloads. Rather than layering AI services onto a traditional hyperscale foundation, the company designed its cloud with AI observability in mind. That architectural choice reflects a broader truth: AI environments evolve constantly, and every layer of the system has to evolve with them, according to Goldberg.
“Every piece of the system keeps changing. We have a new generation of GPUs, we have new storage solutions, we have data solutions, we have literally a new model every day,” she said. “[The question is] how do I build a system that is flexible enough without compromising resiliency and without compromising security?”
With GPUs, storage and models evolving in parallel, complexity can quickly become the bottleneck. CoreWeave tackled the issue by mastering its stack, simplifying its application programming interface and building around a deliberately streamlined architecture, Goldberg explained.
“The way we solved for that is [by] knowing our stack,” she said. “We’ve built a new distributed caching mechanism that sits behind our S3-compatible API. As a customer, I can use … CoreWeave AI object storage; but under the hood, we are optimizing the way we are accessing the data and our customers achieve up to seven gigabytes per second per GPU.”
Here’s the complete video interview, part of News’s and theCUBE’s coverage of Vast Forward:
(* Disclosure: TheCUBE is a media partner for Vast Forward. Sponsors of theCUBE’s coverage, including presenting sponsor Solidigm, do not have editorial control over content on theCUBE or News.)
Photo: News
Support our mission to keep content open and free by engaging with theCUBE community. Join theCUBE’s Alumni Trust Network, where technology leaders connect, share intelligence and create opportunities.
- 15M+ viewers of theCUBE videos, powering conversations across AI, cloud, cybersecurity and more
- 11.4k+ theCUBE alumni — Connect with more than 11,400 tech and business leaders shaping the future through a unique trust-based network.
About News Media
Founded by tech visionaries John Furrier and Dave Vellante, News Media has built a dynamic ecosystem of industry-leading digital media brands that reach 15+ million elite tech professionals. Our new proprietary theCUBE AI Video Cloud is breaking ground in audience interaction, leveraging theCUBEai.com neural network to help technology companies make data-driven decisions and stay at the forefront of industry conversations.
