Cloudflare Inc. has acquired Replicate Inc., a startup with software that makes it easier to deploy artificial intelligence models in production.
The companies announced the transaction today without disclosing its financial terms. Replicate previously raised more than $23 million in funding from Y Combinator, Sequoia Capital and other backers.
Large language models depend on a range of auxiliary components to work. The list of required modules often includes cuDNN, an Nvidia Corp. library that provides neural network building blocks such as attention mechanisms. AI models also typically require a Python runtime, since Python is the go-to language for writing AI workloads.
Individually setting up all the components an LLM requires can take hours. Software teams speed up the process by packaging an LLM and its dependencies into a container. When it’s time to deploy the model in production, developers can simply install the ready-to-use container instead of manually setting up its components.
San Francisco-based Replicate offers an AI catalog that includes containerized versions of more than 50,000 models. The company created the containers using Cog, an internally developed tool it open-sourced in 2019. Packaging an AI model and its supporting components into a container speeds up deployment, but the packaging step itself can still be time-consuming. Cog automates much of that work.
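To illustrate, Cog describes a model's environment in a `cog.yaml` configuration file that points at a Python predictor class. The sketch below is a minimal, hypothetical example; the package versions and file names are assumptions, not taken from any specific Replicate model:

```yaml
# cog.yaml — illustrative sketch of a Cog model configuration.
# Package names and versions here are hypothetical.
build:
  gpu: true                 # request a GPU-enabled base image
  python_version: "3.11"    # Python runtime bundled into the container
  python_packages:
    - "torch==2.4.0"
    - "transformers==4.44.0"
# Tells Cog which class serves predictions when the container runs
predict: "predict.py:Predictor"
```

From a definition like this, Cog builds a container that bundles the model code with its CUDA, Python and library dependencies, which is what makes the resulting image deployable without manual setup.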
Replicate enables customers to deploy its containerized models on a managed cloud platform. The platform, which also supports custom LLMs, removes the need for developers to manage infrastructure. It’s billed based on usage.
Cloudflare will move Replicate’s platform to its own infrastructure, a change the companies expect to improve the platform’s reliability and performance. Additionally, Cloudflare will use the technology it’s obtaining through the acquisition to enhance its Workers AI service.
Like Replicate’s platform, Cloudflare Workers lets developers deploy software in the cloud without maintaining the underlying hardware, which is spread across data centers around the world. When a user sends a request to a Cloudflare Workers application, the platform processes it in the closest data center, which lowers latency.
Workers AI is a version of the platform optimized for machine learning workloads. Cloudflare plans to expand the platform’s catalog of ready-to-use AI models using Replicate’s containerized AI library. Additionally, the company will introduce the ability to run custom LLMs and fine-tuned versions of open-source models.
The development effort will also see Cloudflare enhance its AI Gateway service. The service enables developers to cache an LLM’s responses to frequently recurring user prompts, which removes the need to generate those responses from scratch each time. AI Gateway doubles as an LLM observability tool.
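The caching pattern described above can be sketched in a few lines of Python. This is a minimal illustration of prompt-response caching in general, not Cloudflare's AI Gateway API; the class and function names are hypothetical:

```python
import hashlib


class ResponseCache:
    """Illustrative sketch: cache LLM responses keyed by a hash of the prompt."""

    def __init__(self):
        self._store = {}  # prompt hash -> cached response
        self.hits = 0
        self.misses = 0

    def _key(self, prompt: str) -> str:
        # Hash the prompt so the cache key has a fixed size.
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    def get_or_generate(self, prompt: str, generate):
        """Return a cached response, or call the model and cache the result."""
        key = self._key(prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        response = generate(prompt)  # the expensive LLM call
        self._store[key] = response
        return response


# Usage with a stand-in for a real model call:
cache = ResponseCache()

def fake_llm(prompt: str) -> str:
    return f"answer to: {prompt}"

first = cache.get_or_generate("What is Workers AI?", fake_llm)
second = cache.get_or_generate("What is Workers AI?", fake_llm)  # served from cache
```

Repeated prompts skip the model call entirely, which is what removes the need to generate recurring responses from scratch.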
“We will integrate our unified inference platform deeply with the AI Gateway, giving you a single control plane for observability, prompt management, A/B testing, and cost analytics across all your models, whether they’re running on Cloudflare, Replicate, or any other provider,” Cloudflare Vice President Rita Kozlov and Replicate Chief Executive Officer Ben Firshman wrote in a blog post.
Photo: Wikimedia Commons
