Copyright © All Rights Reserved. World of Software.

KubeCon NA 2025 – Erica Hughberg and Alexa Griffith on Tools for the Age of GenAI

News Room
Last updated: 2025/11/17 at 2:06 PM
News Room Published 17 November 2025

Generative AI (GenAI) brings new workloads, traffic patterns, and infrastructure demands, and it requires a new set of tools. Erica Hughberg from Tetrate and Alexa Griffith from Bloomberg spoke last week at the KubeCon + CloudNativeCon North America 2025 conference about what it takes to build GenAI platforms capable of serving model inference at scale.

The new requirements for GenAI-based applications include dynamic, model-based routing; token-level rate limiting; secure, centralized credential management; and observability, resilience, and failover for AI. Existing tools are not sufficient for these use cases because they lack AI-native logic and offer only simple rate limiting and request-based routing. Kubernetes and tools like KServe, vLLM, Envoy, and llm-d can be used to implement these new requirements. For monitoring and observability of AI applications, teams can use frameworks like OpenTelemetry, Prometheus, and Grafana.
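To make the difference between request-based and token-level rate limiting concrete, here is a minimal Python sketch. It is an illustration only, not how Envoy AI Gateway implements the feature; the class names, the per-client limit, and the token counts are all hypothetical, and resetting the budget at the end of a time window is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class TokenBudget:
    """Per-client budget counted in LLM tokens rather than in requests."""
    limit: int      # tokens allowed per window (hypothetical value)
    used: int = 0   # tokens charged so far in the current window

    def allow(self) -> bool:
        # Admit a request only while some budget remains.
        return self.used < self.limit

    def record(self, prompt_tokens: int, completion_tokens: int) -> None:
        # Charge the real cost once the model has reported its usage.
        self.used += prompt_tokens + completion_tokens


@dataclass
class TokenRateLimiter:
    """Keeps one TokenBudget per client identifier."""
    limit_per_client: int
    budgets: dict = field(default_factory=dict)

    def _budget(self, client_id: str) -> TokenBudget:
        return self.budgets.setdefault(client_id, TokenBudget(self.limit_per_client))

    def allow(self, client_id: str) -> bool:
        return self._budget(client_id).allow()

    def record(self, client_id: str, prompt_tokens: int, completion_tokens: int) -> None:
        self._budget(client_id).record(prompt_tokens, completion_tokens)


limiter = TokenRateLimiter(limit_per_client=1000)
first = limiter.allow("team-a")        # budget is untouched, so the request is admitted
limiter.record("team-a", 200, 900)     # one large completion uses 1100 tokens
second = limiter.allow("team-a")       # budget is exhausted after a single request
print(first, second)
```

A request counter would have seen only one call from `team-a` here, which is why limits expressed in requests say little about actual model cost.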

The speakers discussed their AI application architecture, built with open source projects like Envoy AI Gateway and KServe. Envoy AI Gateway manages traffic at the edge and gives application clients unified access to GenAI services such as an Inference Service or a Model Context Protocol (MCP) server. Its design follows a two-tier gateway pattern. The Tier One Gateway, referred to as the AI Gateway, functions as a centralized entry point and is responsible for authentication, top-level routing, a unified LLM API, and token-based rate limiting. It can also act as an MCP proxy.

The Tier Two Gateway, referred to as the Reference Gateway, manages ingress traffic to the AI models hosted on a Kubernetes cluster and is also responsible for fine-grained control over access to those models. Envoy AI Gateway supports different AI providers like OpenAI, Azure OpenAI, Google Gemini, Vertex AI, AWS Bedrock, and Anthropic.
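The model-based routing and failover that a centralized gateway performs can be sketched conceptually in a few lines of Python. This is not Envoy AI Gateway configuration (the project is configured through Kubernetes resources, not application code); the routing table, the backend labels, and the `is_healthy` callback are hypothetical names chosen for illustration.

```python
from typing import Callable

# Hypothetical routing table: model name -> ordered list of backend labels.
ROUTES = {
    "hosted-chat-model": ["provider-primary", "provider-fallback"],
    "self-hosted-llm": ["in-cluster-kserve"],
}


def route(request: dict, is_healthy: Callable[[str], bool]) -> str:
    """Pick a backend from the `model` field of an OpenAI-style request body."""
    model = request.get("model")
    candidates = ROUTES.get(model)
    if not candidates:
        raise LookupError(f"no route for model {model!r}")
    # Try backends in priority order and fail over to the next healthy one.
    for backend in candidates:
        if is_healthy(backend):
            return backend
    raise RuntimeError(f"all backends for {model!r} are unavailable")


# With every backend healthy, the first candidate is chosen.
print(route({"model": "hosted-chat-model"}, lambda backend: True))
# When the primary is down, traffic fails over to the second candidate.
print(route({"model": "hosted-chat-model"}, lambda backend: backend != "provider-primary"))
```

The point of the sketch is that the routing decision depends on the request body (the requested model), not only on the path or headers that request-based routing inspects.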

KServe is the open-source standard for self-hosted models, providing a unified platform for generative and predictive AI inference on Kubernetes. As a single, declarative API for models, it provides a stable, internal endpoint for each model, to which the Envoy AI Gateway can route traffic. It has recently been retooled to support generative AI capabilities such as multi-framework LLM support, OpenAI-compatible APIs, LLM model caching, KV cache offloading, multi-node inference, metric-based autoscaling, and native support for Hugging Face models with streamlined deployment workflows.

KServe provides a Kubernetes custom resource definition (CRD), built on the foundation of llm-d, a Kubernetes-native LLM inference framework, for serving models on frameworks like PyTorch, TensorFlow, ONNX, or Hugging Face. A Kubernetes YAML manifest for this CRD declares a resource of kind InferenceService, in which we can specify the model metadata and the Gateway API settings for external access.
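As a rough illustration of what such a manifest contains, the Python sketch below builds a minimal InferenceService resource as a dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The field layout follows the `serving.kserve.io/v1beta1` API as commonly documented, but it should be checked against the KServe reference for the version in use. The resource name and storage URI are placeholders, and the Gateway API settings mentioned above are deliberately left out because the talk summary does not describe them.

```python
import json


def inference_service(name: str, model_format: str, storage_uri: str) -> dict:
    """Build a minimal InferenceService manifest (assumed v1beta1 layout)."""
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name},
        "spec": {
            "predictor": {
                "model": {
                    # Tells KServe which serving runtime family to select.
                    "modelFormat": {"name": model_format},
                    # Location of the model artifacts (placeholder value).
                    "storageUri": storage_uri,
                }
            }
        },
    }


manifest = inference_service(
    name="demo-llm",                               # hypothetical resource name
    model_format="huggingface",
    storage_uri="hf://example-org/example-model",  # placeholder, not a real model
)
print(json.dumps(manifest, indent=2))
```

Saving the printed JSON to a file and applying it to a cluster with KServe installed would be the next step; resource requests, runtime arguments, and autoscaling settings would normally be added under `spec.predictor`.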

Hughberg and Griffith concluded the presentation by reiterating that GenAI brings stateful, resource-intensive, and token-based workloads. These workloads need AI-native capabilities like dynamic, model-based routing and token-level rate limiting and cost control. CNCF tools like Kubernetes, Envoy AI Gateway, and KServe can help with developing GenAI-based applications.
