KubeCon NA 2025 – Robert Nishihara on Open Source AI Compute with Kubernetes, Ray, PyTorch, and vLLM

By News Room | Published 28 November 2025 (last updated 2:04 PM)

AI workloads are growing more complex in terms of compute and data, and technologies like Kubernetes and PyTorch can help build production-ready AI systems to support them. Robert Nishihara of Anyscale recently spoke at the KubeCon + CloudNativeCon North America 2025 conference about how an AI compute stack built from Kubernetes, PyTorch, vLLM, and Ray can support these new workloads.

Ray is an open-source framework for building and scaling machine learning and Python applications. It orchestrates infrastructure for distributed workloads and originated at UC Berkeley during a reinforcement learning research project. Ray recently became part of the PyTorch Foundation to contribute to the broader open-source AI ecosystem.

Nishihara emphasized three main areas driving the evolution of AI workloads: data processing, model training, and model serving. Data processing must adapt to the data types emerging in AI applications, expanding beyond traditional tabular data to multimodal datasets that can include images, videos, audio, text, and sensor data. This evolution is crucial for supporting inference tasks, which are a fundamental component of AI-powered applications. In addition, the hardware used for data storage and compute needs to support GPUs alongside standard CPUs. He noted that data processing has shifted from “SQL operations on CPUs” to “inference on GPUs.”
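That shift from “SQL operations on CPUs” to “inference on GPUs” can be pictured with a short Ray Data batch-inference sketch. The example below is hypothetical rather than code from the talk; the dataset paths, model choice, and batch settings are placeholder assumptions.

```python
import numpy as np
import ray
import torch
from PIL import Image
from transformers import pipeline  # assumed inference library for this sketch

# Multimodal input: a directory of images rather than tabular rows.
ds = ray.data.read_images("s3://example-bucket/images/")  # placeholder path

class ImageClassifier:
    """Stateful batch UDF: loads the model once per worker, not once per batch."""

    def __init__(self):
        device = 0 if torch.cuda.is_available() else -1
        self.model = pipeline("image-classification", device=device)

    def __call__(self, batch):
        images = [Image.fromarray(arr) for arr in batch["image"]]
        outputs = self.model(images)
        batch["label"] = np.array([out[0]["label"] for out in outputs])
        return batch

# Inference becomes a data-processing step, with each model replica on a GPU.
predictions = ds.map_batches(
    ImageClassifier,
    batch_size=32,   # placeholder batch size
    concurrency=4,   # four model replicas
    num_gpus=1,      # one GPU per replica
)
predictions.write_parquet("s3://example-bucket/predictions/")  # placeholder output
```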

Model training involves reinforcement learning (RL) and post-training tasks, including generating new data by running inference on models. Ray’s Actor API can be leveraged for the Trainer and Generator components: an actor is essentially a stateful worker, so instantiating an actor class starts a dedicated worker process and schedules that actor’s methods on that specific worker. Furthermore, Ray’s native Remote Direct Memory Access (RDMA) support allows GPU objects to be transported directly over RDMA, improving performance.
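A minimal sketch of that Trainer/Generator pattern with Ray’s Actor API is shown below. The class names, methods, and GPU counts are illustrative assumptions, not code from the talk.

```python
import ray

ray.init()

@ray.remote(num_gpus=1)  # assumes a GPU is available for each actor
class Generator:
    """Stateful worker holding an inference engine (e.g., vLLM) that produces rollouts."""

    def __init__(self, model_name: str):
        self.model_name = model_name  # placeholder: load a serving engine here

    def generate(self, prompts):
        # Placeholder rollout; a real Generator would run batched inference.
        return [f"completion for: {p}" for p in prompts]

@ray.remote(num_gpus=1)
class Trainer:
    """Stateful worker that updates model weights from generated samples."""

    def __init__(self):
        self.step = 0  # placeholder optimizer/model state

    def train_on(self, samples):
        self.step += 1
        return {"step": self.step, "num_samples": len(samples)}

# Instantiating an actor starts a dedicated worker process; method calls are
# scheduled on that specific worker and return object references.
generator = Generator.remote("example-model")
trainer = Trainer.remote()

samples = generator.generate.remote(["prompt A", "prompt B"])
metrics = trainer.train_on.remote(samples)  # Ray resolves the reference before the call
print(ray.get(metrics))
```

Passing `samples` between the two actors as an object reference keeps the data in Ray’s object store instead of routing it through the driver; the RDMA support mentioned above applies to the same kind of hand-off when the objects live on GPUs.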

Several open-source reinforcement learning frameworks have been built on top of Ray. For instance, Composer, from the AI-powered code editor Cursor, is built on Ray. Nishihara also mentioned other notable frameworks, such as Verl (ByteDance), OpenRLHF, ROLL (Alibaba), NeMo-RL (Nvidia), and SkyRL (UC Berkeley), which combine training engines such as Hugging Face, FSDP, DeepSpeed, and Megatron with serving engines such as Hugging Face, vLLM, SGLang, and OpenAI, all orchestrated by Ray.

He also shared the application architecture around Ray, noting that complexity is increasing in both the upper and lower layers, which creates a growing need for software stacks that connect applications at the top to hardware at the bottom. The top layers include AI workloads and model training and inference frameworks such as PyTorch, vLLM, Megatron, and SGLang, while the bottom layers consist of compute substrates (GPUs and CPUs) and cluster orchestrators such as Kubernetes and Slurm. Distributed compute frameworks such as Ray and Spark act as the bridge between these layers, handling data ingestion and data movement.

Kubernetes and Ray complement one another for hosting AI applications: together they extend container-level isolation with process-level isolation and offer both vertical and horizontal autoscaling. Nishihara pointed out that because inference demand rises and falls relative to model training demand, it becomes beneficial to shift GPUs between the two stages, a capability made possible by running Ray on Kubernetes.
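As a concrete sketch of that elasticity, a Ray Serve deployment (typically run on Kubernetes through the KubeRay operator) can autoscale GPU-backed inference replicas with demand, releasing GPUs for other workloads such as training when traffic drops. The model name and autoscaling values below are illustrative assumptions, not figures from the talk.

```python
from ray import serve

@serve.deployment(
    ray_actor_options={"num_gpus": 1},   # each replica reserves one GPU
    autoscaling_config={
        "min_replicas": 0,               # release all inference GPUs when idle
        "max_replicas": 8,               # scale out under load
        "target_ongoing_requests": 4,    # illustrative per-replica target
    },
)
class LLMService:
    def __init__(self):
        # Placeholder: a production deployment would load a vLLM engine here.
        self.model_name = "example-model"

    async def __call__(self, request):
        prompt = (await request.json())["prompt"]
        return {"model": self.model_name, "completion": f"echo: {prompt}"}

# On Kubernetes, the KubeRay autoscaler adds or removes GPU pods as Ray Serve
# moves the replica count between 0 and 8.
serve.run(LLMService.bind())
```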

In conclusion, Nishihara underscored the core requirements of AI platforms, which must support a native multi-cloud experience, workload prioritization across GPU reservations, observability and tooling, model and data lineage tracking, and overall governance. Observability is essential at both the container level and the workload and process levels to monitor metrics, such as object transfer speeds.