Akamai Technologies has presented its NVIDIA AI Grid reference design implementation, with the aim of taking the industry from siloed AI factories to a unified, distributed network for AI inference. To do this, Akamai has integrated NVIDIA AI infrastructure into its own infrastructure and leveraged intelligent workload orchestration across its network.
The move is a step in the evolution of Akamai Inference Cloud, which is deploying thousands of NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs, resulting in a platform that enables enterprises to run physical and agentic AI with the responsiveness of local computing and the scale of the global web.
At the core of AI Grid is an intelligent coordinator that acts as a real-time intermediary for AI requests. Drawing on Akamai’s expertise in optimizing the performance of AI applications, this workload-aware control plane improves tokenomics: cost per token, time to first token, and overall performance.
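To make that trade-off concrete, here is a minimal sketch of how a workload-aware router might choose between an edge PoP and a core region: it estimates time to first token (TTFT) for each candidate site and picks the cheapest one that still meets the latency budget. The site names, numbers, and scoring rule are illustrative assumptions, not Akamai’s actual control plane.

```python
from dataclasses import dataclass

@dataclass
class Site:
    """A candidate inference location (edge PoP or core region)."""
    name: str
    network_rtt_ms: float      # round trip from the user to this site
    queue_delay_ms: float      # current wait before decoding starts
    cost_per_1m_tokens: float  # USD, blended prompt + completion price

def time_to_first_token_ms(site: Site, prefill_ms: float) -> float:
    # TTFT = network round trip + queueing + prompt prefill on the GPU.
    return site.network_rtt_ms + site.queue_delay_ms + prefill_ms

def pick_site(sites: list[Site], prefill_ms: float,
              ttft_budget_ms: float) -> Site:
    """Cheapest site that meets the latency SLA; else the fastest one."""
    within_sla = [s for s in sites
                  if time_to_first_token_ms(s, prefill_ms) <= ttft_budget_ms]
    if within_sla:
        return min(within_sla, key=lambda s: s.cost_per_1m_tokens)
    return min(sites, key=lambda s: time_to_first_token_ms(s, prefill_ms))

sites = [
    Site("edge-pop-madrid", network_rtt_ms=8, queue_delay_ms=40,
         cost_per_1m_tokens=3.50),
    Site("core-us-east", network_rtt_ms=95, queue_delay_ms=5,
         cost_per_1m_tokens=1.20),
]
print(pick_site(sites, prefill_ms=60, ttft_budget_ms=150).name)
```

In this toy model, an interactive agent with a tight 150 ms TTFT budget routes to the edge even though the core offers a lower cost per token; relax the budget and the same request would fall through to the cheaper core cluster.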
Based on NVIDIA AI Enterprise, and powered by the NVIDIA Blackwell architecture and NVIDIA BlueField DPUs for hardware-accelerated networking and security, the platform lets Akamai manage SLAs at both edge and core locations. At the edge, with more than 4,400 locations, it offers fast response times for physical AI and autonomous agents, and will leverage semantic caching and serverless capabilities, such as Akamai Functions and EdgeWorkers, to deliver model affinity and stable performance at the user touchpoint.
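Semantic caching deserves a brief illustration: rather than keying responses on exact request bytes, the cache embeds prompts and serves a stored answer when a new prompt is semantically close to an earlier one, avoiding a round trip to the model. The sketch below is a self-contained approximation; the toy bigram embedding, the 0.8 similarity threshold, and the SemanticCache class are assumptions for illustration, and a real deployment would use a proper embedding model and a vector index.

```python
import math

def embed(text: str) -> list[float]:
    # Stand-in embedding: a real system would call an embedding model.
    # This toy version hashes character bigrams into a small dense vector.
    vec = [0.0] * 64
    for a, b in zip(text.lower(), text.lower()[1:]):
        vec[(ord(a) * 31 + ord(b)) % 64] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(u: list[float], v: list[float]) -> float:
    return sum(a * b for a, b in zip(u, v))

class SemanticCache:
    """Serve a cached answer when a new prompt is close to a stored one."""
    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold  # illustrative value for the toy embedding
        self.entries: list[tuple[list[float], str]] = []

    def get(self, prompt: str) -> str | None:
        query = embed(prompt)
        best = max(self.entries, key=lambda e: cosine(query, e[0]),
                   default=None)
        if best and cosine(query, best[0]) >= self.threshold:
            return best[1]  # cache hit: skip inference entirely
        return None

    def put(self, prompt: str, answer: str) -> None:
        self.entries.append((embed(prompt), answer))

cache = SemanticCache()
cache.put("What is Akamai Inference Cloud?",
          "A distributed AI inference platform.")
# A paraphrased question can hit the cache even though the strings differ.
print(cache.get("what's akamai inference cloud"))
```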
With Akamai Cloud IaaS and dedicated GPU clusters, the public cloud core infrastructure enables portability and cost savings for large-scale workloads, while pods equipped with NVIDIA RTX PRO 6000 Blackwell GPUs handle heavy-duty post-training and multimodal inference.
The first wave of AI infrastructure was characterized by large clusters of GPUs in a few centralized locations optimized for training. But as inference becomes the dominant workload, and companies across industries focus on building AI agents, that centralized model faces the same scaling limitations that previous generations of Internet infrastructure encountered with media delivery, online gaming, financial transactions, and complex microservices applications.
Akamai is solving each of these challenges with the same basic approach: distributed networks, intelligent orchestration, and systems specifically designed to bring content and context together as close to the digital touchpoint as possible. The result has been an improvement in the user experience and a greater ROI for companies that have adopted the model. Akamai Inference Cloud applies that same proven architecture to AI factories, enabling the next wave of scale and growth by delivering dense computing from the core to the edge.
Adam Karon, COO and General Manager of Akamai’s Cloud Technology Group, stressed that “AI factories have been specifically designed for cutting-edge model training and workloads, and centralized infrastructure will continue to offer the best tokenomics for those use cases. But real-time video, physical AI, and highly concurrent personalized experiences demand inference at the point of touch, not a round trip to a centralized cluster. Our intelligent AI Grid orchestration gives AI factories a way to scale inference outward, leveraging the same distributed architecture that revolutionized content delivery to route AI workloads across 4,400 locations, at the right cost and at the right time.”
Chris Penrose, Global Vice President of Business Development and Telco at NVIDIA, commented that “new AI-native applications demand predictable latency and greater profitability on a global scale. By launching NVIDIA AI Grid, Akamai is creating the connective tissue for generative, agentic, and physical AI, moving intelligence directly into data to unlock the next wave of real-time applications.”
