Copyright © All Rights Reserved. World of Software.
KubeCon NA 2025 – Salesforce’s Approach to Self-Healing Using AIOps and Agentic AI

News Room
Last updated: 2025/11/12 at 5:30 PM
Published 12 November 2025

AIOps and agentic AI can help teams build solutions that intelligently analyze Kubernetes cluster health, automatically diagnose platform problems, and orchestrate issue resolution with minimal human intervention. Vikram Venkataraman from AWS and Srikanth Rajan from Salesforce spoke on Tuesday at the KubeCon + CloudNativeCon North America 2025 conference about Salesforce’s approach to self-healing systems using AIOps and AI agents.

The AIOps architecture was developed at Salesforce by the team that builds and supports the infrastructure management software behind the Hyperforce Kubernetes Platform, a managed Kubernetes platform built on multiple clouds (AWS, GCP, Alicloud) that provides namespace-as-a-service. The platform operates at considerable scale: 1,400 K8s clusters, millions of pods, thousands of compute nodes, 40+ operators and integrations, and 200+ monitoring plugins. The speakers said they expect capacity to grow fivefold in the next couple of years. The overall goal of the solution is to let application teams focus on business requirements rather than get bogged down in infrastructure overhead.

They discussed their approach to Kubernetes platform operations, which leverages generative AI and multi-agent collaboration to create a cluster management system that troubleshoots Kubernetes clusters and reduces mean time to identify (MTTI) and mean time to resolve (MTTR) for critical cluster issues. The agentic AI solution consists of AI agents, each with a specific goal within the AIOps platform, and tools that retrieve data from the telemetry platform. Agents can also perform actions against the K8s environment, such as rolling back an upgrade if issues arise during the upgrade process.
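To make the two metrics concrete, here is a minimal sketch of how MTTI and MTTR could be computed from incident timestamps. It assumes both are measured from the moment an incident starts (definitions vary between organizations); the `Incident` record and its field names are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    # Hypothetical record; field names are illustrative only.
    started: datetime
    identified: datetime
    resolved: datetime


def mean_minutes(deltas):
    """Average a series of timedeltas and return the result in minutes."""
    return mean(d.total_seconds() for d in deltas) / 60


def mtti_minutes(incidents):
    """Mean time to identify: start of incident until the cause is identified."""
    return mean_minutes(i.identified - i.started for i in incidents)


def mttr_minutes(incidents):
    """Mean time to resolve: start of incident until it is resolved."""
    return mean_minutes(i.resolved - i.started for i in incidents)


t0 = datetime(2025, 11, 11, 9, 0)
sample = [
    Incident(t0, t0 + timedelta(minutes=10), t0 + timedelta(minutes=30)),
    Incident(t0, t0 + timedelta(minutes=20), t0 + timedelta(minutes=50)),
]
print(mtti_minutes(sample), mttr_minutes(sample))
```

Tracking both numbers separately shows whether an agent is speeding up diagnosis, remediation, or both.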

Venkataraman and Rajan spoke about the challenges of building AI for intelligent operations, such as how different agents should communicate with each other and which guardrails and security permissions agents need so that they operate within guidelines. They discussed the details of the solution architecture, hosted on the AWS cloud platform, which consists of an AIOps UI for engineers, a Collaborator Agent, Amazon Prometheus and its agent, Amazon EKS, the k8sgpt Operator, which helps with MTTI metrics, and the ArgoCD Controller.

The speakers then shared the details of their tech stack, showing its layers of open source technologies and home-grown tools:

  • Substrate: Kubernetes cloud platforms such as Amazon EKS, self-managed K8s, Google GKE, and Alicloud ACK.
  • Standard Capabilities: storage, networking, autoscaling, DNS, load balancing, mesh, and Ingress. Technologies used in this layer include Istio, Cluster Autoscaler, CSI, OPA, Ingress, CNI, LBC, and CoreDNS.
  • Custom Integrations: capabilities such as identity, secrets management, guardrails, and log collection.
  • Platform Capabilities: components for platform abstractions, deployment orchestration, lifecycle automation, visibility and observability, resiliency, cost management, and best practices enforcement. Tools in this layer include Argo, Kyverno, Spinnaker, Helm, Kube Magic Mirror, Sloop, and Periscope.
  • API layer: provides customer access services and hosts the control plane, APIs, and self-service portals.

To solve problems like siloed tools, static workflows, and limited feedback loops, the team developed an infrastructure management solution based on AI agents. They started small with a few agents: an AIOps agent (the on-call report agent) and a Kubectl agent that integrates with teams' channels in Slack, translates natural language questions into kubectl commands, and posts debugging information back to Slack. There is also the Live Site Analysis Agent, which automates the weekly platform availability review process by analyzing metrics like SLA misses and generating root cause analysis (RCA) insights.
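An agent that turns natural language into kubectl commands needs a check between the generated command and its execution. The sketch below is one possible guardrail, not Salesforce's implementation: it accepts only plain, read-only kubectl invocations. The function name, the verb allowlist, and the metacharacter list are all assumptions chosen for illustration.

```python
import shlex

# Hypothetical allowlist of read-only kubectl verbs; a real policy would differ.
READ_ONLY_VERBS = {"get", "describe", "logs", "top"}
# Characters that suggest shell chaining, substitution, or redirection.
SHELL_METACHARS = set(";|&$`<>")


def is_allowed_kubectl(command: str) -> bool:
    """Return True only for a plain, read-only kubectl invocation."""
    try:
        tokens = shlex.split(command)
    except ValueError:  # for example, unbalanced quotes
        return False
    if len(tokens) < 2 or tokens[0] != "kubectl":
        return False
    if any(SHELL_METACHARS & set(token) for token in tokens):
        return False
    # Conservative rule: the verb must come directly after "kubectl".
    return tokens[1] in READ_ONLY_VERBS


print(is_allowed_kubectl("kubectl get pods -n payments"))
print(is_allowed_kubectl("kubectl delete pod api-0"))
```

A string check like this is only a first line of defense. Read-only verbs can still expose sensitive data, so the agent's credentials would also need to be restricted on the cluster side, for example through Kubernetes RBAC.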

The speakers recommended progressive autonomy when adopting solutions based on AI agents in your own organization. Their initial approach kept a human in the loop to ensure the safety and accuracy of issue resolutions. Once the team gained confidence in the AI agents, they started granting the agentic solutions more autonomy.
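The idea of progressive autonomy can be illustrated with a small gate that requires human approval until an agent has built a track record. This is a sketch under stated assumptions, not the mechanism described in the talk: the `AutonomyGate` class, the streak-based promotion rule, and the threshold value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AutonomyGate:
    promote_after: int = 5    # hypothetical: successes needed before autonomy
    streak: int = 0           # consecutive successful remediations so far
    autonomous: bool = False  # starts with a human in the loop

    def needs_human(self, high_risk: bool) -> bool:
        """High-risk actions always need approval; others only until promoted."""
        return high_risk or not self.autonomous

    def record(self, success: bool) -> None:
        """Update the track record after a remediation attempt."""
        if success:
            self.streak += 1
            if self.streak >= self.promote_after:
                self.autonomous = True
        else:
            # Any failure resets the streak and returns control to humans.
            self.streak = 0
            self.autonomous = False


gate = AutonomyGate(promote_after=2)
print(gate.needs_human(high_risk=False))  # approval needed at first
gate.record(True)
gate.record(True)
print(gate.needs_human(high_risk=False))  # promoted after two successes
```

Demoting on a single failure is a deliberately cautious choice in this sketch; a real policy might weigh severity or use a longer history.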

They concluded the talk by saying the team has just scratched the surface of what AI technologies can do and that AI agents can be useful in several other use cases. Their AIOps program roadmap includes scaling the AI agents to eliminate 80% of manual work, building a knowledge graph that holds the information needed to connect the dots between different components in the overall system, and using AI to detect and troubleshoot hard performance problems.

For more information on this and other conference sessions, check out the conference website and the program schedule.

 
