In collaboration with NVIDIA, Microsoft has announced the integration of NVIDIA NIM microservices and the NVIDIA AgentIQ toolkit into Azure AI Foundry. This strategic move aims to significantly accelerate the development, deployment, and optimization of enterprise-grade AI agent applications, promising streamlined workflows, enhanced performance, and reduced infrastructure costs for developers.
The integration addresses the often lengthy enterprise AI project lifecycle, which can extend from nine to twelve months. By providing a more efficient, integrated development pipeline within Azure AI Foundry that leverages NVIDIA's accelerated computing and AI software, the goal is to enable faster time-to-market without compromising the sophistication or performance of AI solutions.
NVIDIA NIM (NVIDIA Inference Microservices), a key component of the NVIDIA AI Enterprise software suite, is a collection of containerized microservices engineered for high-performance AI inference. Built on technologies such as NVIDIA Triton Inference Server, TensorRT, TensorRT-LLM, and PyTorch, NIM microservices give developers zero-configuration deployment, seamless integration with the Azure ecosystem (including Azure AI Agent Service and Semantic Kernel), enterprise-grade reliability backed by NVIDIA AI Enterprise support, and access to Azure's NVIDIA-accelerated infrastructure for demanding workloads. Developers can deploy optimized models, including Llama-3-70B-NIM, directly from the Azure AI Foundry model catalog with just a few clicks, simplifying initial setup and deployment.
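Deployed NIM microservices expose an OpenAI-compatible chat-completions API. As a rough sketch of what calling such a deployment looks like (the endpoint URL, API key, and model name below are placeholders, not values from the announcement), a request could be assembled with only the standard library:

```python
import json
import urllib.request


def build_chat_request(endpoint: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Assemble an OpenAI-compatible /v1/chat/completions request for a NIM endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Hypothetical deployment details -- substitute your own Foundry endpoint and key.
req = build_chat_request(
    endpoint="https://<your-foundry-endpoint>",
    api_key="<your-api-key>",
    model="meta/llama3-70b-instruct",
    prompt="Summarize the benefits of containerized inference in two sentences.",
)
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

In practice the same request shape works with any OpenAI-compatible client library; the point is that a catalog-deployed NIM model is reachable as a standard HTTP inference endpoint.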
Once NVIDIA NIM microservices are deployed, NVIDIA AgentIQ, an open-source toolkit, takes center stage in optimizing AI agent performance. AgentIQ is designed to seamlessly connect, profile, and fine-tune teams of AI agents, enabling systems to operate at peak efficiency.
Daron Yondem tweeted on X:
NVIDIA’s AgentIQ treats agents, tools, and workflows as simple function calls, aiming for true composability: build once, reuse everywhere.
The toolkit leverages real-time telemetry to analyze AI agent placement, dynamically adjusting resources to reduce latency and compute overhead. AgentIQ also continuously collects and analyzes metadata, such as the predicted number of output tokens per call, estimated time to next inference, and expected token lengths, to dynamically enhance agent performance and responsiveness. Direct integration with Azure AI Foundry Agent Service and Semantic Kernel further empowers developers to build agents with enhanced semantic reasoning and task-execution capabilities, leading to more accurate and efficient agentic workflows.
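The "agents, tools, and workflows as function calls" framing combined with telemetry collection can be illustrated with a small profiling wrapper. This is a conceptual sketch of the pattern only, not AgentIQ's actual API; all names here are invented for illustration:

```python
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class CallStats:
    """Per-tool telemetry, analogous in spirit to the metadata AgentIQ gathers."""
    calls: int = 0
    total_latency_s: float = 0.0
    total_output_tokens: int = 0


@dataclass
class Profiler:
    stats: dict = field(default_factory=dict)

    def instrument(self, name: str, fn: Callable[[str], str]) -> Callable[[str], str]:
        """Wrap a tool/agent callable so every call records latency and output size."""
        self.stats[name] = CallStats()

        def wrapped(prompt: str) -> str:
            start = time.perf_counter()
            out = fn(prompt)
            s = self.stats[name]
            s.calls += 1
            s.total_latency_s += time.perf_counter() - start
            s.total_output_tokens += len(out.split())  # crude whitespace token estimate
            return out

        return wrapped


# Tools and agents are plain callables, so they compose freely: build once, reuse everywhere.
def summarizer(text: str) -> str:
    return " ".join(text.split()[:5]) + " ..."


profiler = Profiler()
summarize = profiler.instrument("summarizer", summarizer)


def review_workflow(document: str) -> str:
    # The same instrumented tool is reused inside a larger workflow.
    return "SUMMARY: " + summarize(document)


result = review_workflow("Azure AI Foundry now integrates NVIDIA NIM and AgentIQ for agents")
```

A runtime that sees every agent and tool as an instrumented callable can then use the accumulated stats to decide where to place agents or how to rebalance resources, which is the optimization loop the paragraph above describes.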
(Source: Dev Blog post)
Drew McCombs, vice president of cloud and analytics at Epic, highlighted the practical benefits of this integration in an AI and Machine Learning blog post, stating:
The launch of NVIDIA NIM microservices in Azure AI Foundry offers Epic a secure and efficient way to deploy open-source generative AI models that improve patient care.
In addition, Guy Fighel, a VP and GM at New Relic, posted on LinkedIn:
NVIDIA #AgentIQ will likely become a leading strategy for enterprises adopting agentic development. Its ease of use, open-source nature, and optimization for NVIDIA hardware provide a competitive advantage by reducing development complexity, optimizing performance on NVIDIA GPUs, and integrating with cloud platforms like Microsoft Azure AI Foundry for scalability.
Microsoft has also announced the upcoming integration of NVIDIA Llama Nemotron Reason, a powerful AI model family designed for advanced reasoning in coding, complex math, and scientific problem-solving. Nemotron's ability to understand user intent and seamlessly call tools promises to further enhance the capabilities of AI agents built on the Azure AI Foundry platform.