KubeCon NA 2025 – Robert Nishihara on Open Source AI Compute with Kubernetes, Ray, PyTorch, and vLLM

News Room | Published 28 November 2025 | Last updated 2025/11/28 at 2:04 PM

AI workloads are growing more complex in terms of compute and data, and technologies like Kubernetes and PyTorch can help build production-ready AI systems to support them. Robert Nishihara from Anyscale recently spoke at KubeCon + CloudNativeCon North America 2025 about how an AI compute stack built on Kubernetes, PyTorch, vLLM, and Ray can support these new workloads.

Ray is an open-source framework for building and scaling machine learning and Python applications. It orchestrates infrastructure for distributed workloads and was originally developed at UC Berkeley during a reinforcement learning research project. Recently, Ray became part of the PyTorch Foundation to contribute to the broader open-source AI ecosystem.
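
For readers unfamiliar with Ray, the sketch below shows its core task API in a minimal form; it is illustrative rather than taken from the talk, and assumes a local machine or an existing Ray cluster.

import ray

# Start Ray locally, or connect to an existing cluster if one is configured.
ray.init()

# The @ray.remote decorator turns an ordinary Python function into a
# distributed task that Ray can schedule anywhere in the cluster.
@ray.remote
def square(x: int) -> int:
    return x * x

# .remote() returns futures (ObjectRefs) immediately; the work runs in parallel.
futures = [square.remote(i) for i in range(8)]

# ray.get() blocks until the results are ready.
print(ray.get(futures))  # [0, 1, 4, 9, 16, 25, 36, 49]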

Nishihara emphasized three main areas driving the evolution of AI workloads: data processing, model training, and model serving. Data processing must adapt to the emerging data types needed for AI applications, expanding beyond traditional tabular data to multimodal datasets that can encompass images, videos, audio, text, and sensor data. This evolution is crucial for supporting inference tasks, which are a fundamental component of AI-powered applications. Additionally, the hardware used for data storage and compute needs to support GPUs alongside standard CPUs. He noted that data processing has shifted from “SQL operations on CPUs” to “inference on GPUs.”
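
As a rough illustration of that shift, the sketch below uses Ray Data to run GPU batch inference as a data-processing step. The bucket paths are hypothetical, the model is an off-the-shelf torchvision ResNet, and the code assumes a cluster with GPUs.

import ray
import torch
from torchvision.models import resnet18, ResNet18_Weights

# Read a (hypothetical) image dataset into a distributed Ray Dataset,
# resizing so images can be batched together.
ds = ray.data.read_images("s3://my-bucket/images/", size=(224, 224))

class BatchPredictor:
    """Stateful callable: the model is loaded once per worker, not once per batch."""

    def __init__(self):
        self.model = resnet18(weights=ResNet18_Weights.DEFAULT).eval().cuda()

    def __call__(self, batch):
        # Ray Data passes batches as dicts of NumPy arrays; "image" is the
        # default column produced by read_images (HWC layout).
        images = torch.as_tensor(batch["image"]).permute(0, 3, 1, 2).float().cuda()
        with torch.no_grad():
            logits = self.model(images)
        batch["label"] = logits.argmax(dim=1).cpu().numpy()
        return batch

# Each worker in the pool reserves one GPU; inference becomes part of the data pipeline.
predictions = ds.map_batches(BatchPredictor, batch_size=32, num_gpus=1, concurrency=2)
predictions.write_parquet("s3://my-bucket/predictions/")  # hypothetical output path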

Model training involves reinforcement learning (RL) and post-training tasks, including generating new data by running inference on models. Ray’s Actor API can be leveraged for the Trainer and Generator components. An “Actor” is essentially a stateful worker: when an actor is instantiated, a new worker process is created, and the actor’s methods are scheduled on that specific worker. Furthermore, Ray’s native Remote Direct Memory Access (RDMA) support allows GPU objects to be transported directly over RDMA, improving performance.
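
A minimal sketch of that pattern with Ray actors is shown below. The class and method names are illustrative, not taken from the talk, and in a real setup each actor would request GPUs via @ray.remote(num_gpus=...).

import ray

ray.init()

@ray.remote
class Generator:
    """Stateful actor: instantiating it starts a dedicated worker process, and
    its methods are scheduled on that worker, keeping the model resident."""

    def __init__(self, model_name: str):
        self.model_name = model_name  # in practice, load a vLLM or PyTorch model here

    def generate(self, prompts):
        # Placeholder for running inference to produce new training samples.
        return [f"{self.model_name} output for: {p}" for p in prompts]

@ray.remote
class Trainer:
    """Actor that holds training state (weights, optimizer) across calls."""

    def __init__(self):
        self.steps = 0

    def train_step(self, samples):
        self.steps += 1  # placeholder for an RL / post-training update
        return self.steps

generator = Generator.remote("policy-model")  # illustrative model name
trainer = Trainer.remote()

# The ObjectRef returned by generate is passed straight to the trainer;
# Ray resolves it to the actual samples before train_step runs.
samples = generator.generate.remote(["prompt-1", "prompt-2"])
print(ray.get(trainer.train_step.remote(samples)))  # 1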

Several open-source reinforcement learning frameworks have been developed on top of Ray. For instance, Composer, part of the AI-powered code editor Cursor, is built on Ray. Nishihara also mentioned other notable frameworks, such as verl (ByteDance), OpenRLHF, ROLL (Alibaba), NeMo-RL (NVIDIA), and SkyRL (UC Berkeley), which combine training engines such as Hugging Face, FSDP, DeepSpeed, and Megatron with serving engines such as Hugging Face, vLLM, SGLang, and OpenAI, all orchestrated by Ray.

He shared the application architecture behind Ray, noting that complexity is increasing in both the upper and lower layers, and that there is a growing need for software stacks that connect applications at the top layer to hardware at the bottom layer. The top layers include AI workloads and model training and inference frameworks like PyTorch, vLLM, Megatron, and SGLang. The bottom layers consist of compute substrates (GPUs and CPUs) and orchestrators like Kubernetes and Slurm. Distributed compute frameworks such as Ray and Spark act as bridges between these top and bottom layers, handling data ingestion and data movement.
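
To make the top-layer inference side of that stack concrete, a minimal vLLM offline-inference sketch looks roughly like the following; the model name is illustrative, and a GPU is assumed to be available.

from vllm import LLM, SamplingParams

# Load an open-weights model; any Hugging Face model supported by vLLM works here.
llm = LLM(model="Qwen/Qwen2.5-0.5B-Instruct")  # illustrative model choice

sampling_params = SamplingParams(temperature=0.7, max_tokens=64)
prompts = ["Explain what a Kubernetes pod is in one sentence."]

# vLLM batches the prompts and runs generation on the available GPU(s).
for output in llm.generate(prompts, sampling_params):
    print(output.outputs[0].text)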

Kubernetes and Ray complement one another for hosting AI applications: Ray extends Kubernetes’ container-level isolation with process-level isolation, and together they offer both vertical and horizontal autoscaling. Nishihara pointed out that because demand for the inference stage rises and falls relative to model training, it becomes beneficial to shift GPUs between these stages, a capability made possible by using Ray and Kubernetes together.
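
As a rough sketch of how the two layers cooperate, the Ray Serve deployment below autoscales inference replicas at the process level, while a Kubernetes operator such as KubeRay can scale the underlying pods and nodes. The deployment is illustrative and assumes GPUs are available.

from ray import serve

@serve.deployment(
    ray_actor_options={"num_gpus": 1},                          # each replica reserves a GPU
    autoscaling_config={"min_replicas": 1, "max_replicas": 8},  # Serve adds/removes replicas with load
)
class ChatModel:
    def __init__(self):
        # In practice, load a vLLM or PyTorch model here.
        self.name = "chat-model"  # illustrative

    async def __call__(self, request):
        prompt = (await request.json())["prompt"]
        # Placeholder for real inference.
        return {"model": self.name, "echo": prompt}

# serve.run deploys onto whichever Ray cluster is available (a laptop, VMs,
# or a KubeRay-managed cluster on Kubernetes) without changing this code.
serve.run(ChatModel.bind())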

In conclusion, Nishihara underscored the core requirements of AI platforms: they must support a native multi-cloud experience, workload prioritization across GPU reservations, observability and tooling, model and data lineage tracking, and overall governance. Observability is essential at both the container level and the workload and process levels, in order to monitor metrics such as object transfer speeds.