At KubeCon + CloudNativeCon North America 2024, Nandhakumar Venkatachalam and Payal Patel shared Yahoo’s Kubernetes journey from on-premises to multi-cloud at scale, underscoring the challenges faced and lessons learned during this transition.
Venkatachalam began by discussing Yahoo’s initial infrastructure and motivations for adopting Kubernetes. He highlighted the scale of Yahoo’s operations, which include over 70 clusters and more than 100,000 pods. He further explained:
“Our journey to Kubernetes was driven by the need for greater scalability, flexibility, and efficiency in our infrastructure management.”
The speakers outlined Yahoo’s cloud-native path, which involved GitHub Enterprise for source control and Screwdriver, an open-source build platform, for continuous delivery. It also included an internal OCI registry for artifacts and a combination of on-premises and cloud Kubernetes clusters.
Patel then delved into the challenges Yahoo faced during the transition, including networking, managing the complexity of a hybrid infrastructure, ensuring security and compliance across multiple environments, optimizing performance and cost in a multi-cloud setup, and training teams to work with Kubernetes.
The presentation highlighted three key areas where Yahoo improved its security controls:
- Software composition analysis was introduced during continuous integration to detect and automatically remediate vulnerabilities in open-source dependencies.
- A build-time vulnerability assessment was integrated to scan images, including base images and programming libraries.
- Production deployment verification was added to check images before deployment for provenance, signature, and freshness.
Venkatachalam demonstrated these security controls in action, showing how policies can be used to allow or reject deployments based on various criteria.
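As a rough illustration of the kind of deployment check described above, the following minimal sketch evaluates an image against provenance, signature, and freshness criteria. It is not Yahoo’s implementation: the `ImageMetadata`, `DeployPolicy`, and `evaluate` names, the registry hostname, and the 30-day age limit are all hypothetical, and the signature result is assumed to come from a separate verification step.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class ImageMetadata:
    """Facts about an image gathered before deployment (hypothetical shape)."""
    reference: str
    registry: str             # where the image was pulled from (provenance)
    signature_verified: bool  # assumed output of a separate signature check
    built_at: datetime        # build timestamp, used for freshness


@dataclass(frozen=True)
class DeployPolicy:
    """Hypothetical policy: trusted registries and a maximum image age."""
    trusted_registries: FrozenSet[str]
    max_image_age: timedelta


def evaluate(
    image: ImageMetadata,
    policy: DeployPolicy,
    now: Optional[datetime] = None,
) -> Tuple[bool, List[str]]:
    """Return (allowed, reasons); reasons lists every failed criterion."""
    now = now or datetime.now(timezone.utc)
    reasons: List[str] = []
    if image.registry not in policy.trusted_registries:
        reasons.append("provenance: untrusted registry")
    if not image.signature_verified:
        reasons.append("signature: missing or invalid")
    if now - image.built_at > policy.max_image_age:
        reasons.append("freshness: image too old")
    return (not reasons, reasons)


if __name__ == "__main__":
    policy = DeployPolicy(
        trusted_registries=frozenset({"registry.example.internal"}),
        max_image_age=timedelta(days=30),
    )
    image = ImageMetadata(
        reference="registry.example.internal/app:1.0",
        registry="registry.example.internal",
        signature_verified=True,
        built_at=datetime.now(timezone.utc) - timedelta(days=3),
    )
    print(evaluate(image, policy))
```

Collecting every failed criterion, rather than stopping at the first, gives developers a complete rejection message in one pass. In a real cluster, a check like this would typically run in an admission step before the workload is scheduled.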
The speakers shared several lessons learned throughout Yahoo’s journey:
- Plan for adoption from the outset
- Embrace open-source technologies
- Continuously seek feedback to optimize developers’ workflow and experience
- Invest in automation to manage complexity at scale
- Prioritize security and compliance throughout the transition
Patel emphasized the importance of cultural change alongside technological transformation:
“Moving to Kubernetes isn’t just about adopting new technology; it’s about changing how we think about infrastructure and application deployment. It requires a shift in mindset across the entire organization.”
The session also touched on Yahoo’s approach to managing multi-cloud environments, highlighting the importance of standardization and automation in maintaining consistency across different cloud providers. The speakers discussed tools and practices that helped Yahoo achieve a unified approach to deployment and management across its hybrid infrastructure.
Performance optimization was another key topic, with the speakers sharing insights on how they fine-tuned their Kubernetes clusters for Yahoo’s specific workloads. This included strategies for resource allocation, autoscaling, and load balancing that helped them maximize efficiency and minimize costs.
The presenters also addressed the challenges of data management in a multi-cloud Kubernetes environment, discussing their approach to data replication, backup, and recovery across different cloud platforms.
Looking to the future, Venkatachalam and Patel outlined Yahoo’s plans for further optimizing their multi-cloud strategy and continued investment in cloud-native technologies. They emphasized the importance of staying adaptable and continuously learning as the Kubernetes ecosystem evolves.
The session concluded with a Q&A, where attendees asked about specific technical challenges and how Yahoo overcame them. The speakers provided insights into their decision-making process and offered advice for organizations embarking on similar journeys.
The breakout session recording is available on the CNCF YouTube channel, and the presentation slides are accessible on the event’s webpage.