Microsoft continues to evolve its Azure AI Foundry Agent Service and recently added the public preview of Deep Research, which allows users to conduct in-depth, multi-step research using public web data.
Deep Research is a specialized AI capability that moves beyond simple information retrieval, enabling AI agents to autonomously analyze, synthesize, and report on complex information. It is specifically engineered to support knowledge workers in demanding fields such as science, finance, and policy, where rigor, comprehensive documentation, and traceability are essential.
Azure AI Foundry Agent Service is a key part of Azure AI Foundry, a platform that combines models, tools, frameworks, and governance into a unified system for building intelligent agents; within it, Agent Service runs agents across development, deployment, and production. Microsoft has been investing heavily in Foundry and Agent Service, with recent additions such as Model Context Protocol (MCP) support, multi-agent orchestration, and observability.
Announcing the preview of Deep Research, Yina Arenas, VP at Microsoft, wrote in an AI and Machine Learning blog post:
With Deep Research, developers can build agents that deeply plan, analyze, and synthesize information from across the web—automate complex research tasks, generate transparent, auditable outputs, and seamlessly compose multi-step workflows with other tools and agents in Azure AI Foundry.
The Deep Research capability operates through a multi-stage agent pipeline designed to mimic rigorous human research workflows (see the SDK sketch after the list):
- Intent Clarification, which uses advanced GPT-series models, including GPT-4o and GPT-4.1, to clarify the initial research query and precisely scope the task.
- Web Data Discovery, where the agent securely invokes the Grounding with Bing Search tool to gather high-quality, recent web data, significantly mitigating hallucination risks and improving factual accuracy. It is important to note, however, that data transferred via Bing Search is handled outside the Azure compliance boundary.
- Deep Analytical Execution, where the core o3-deep-research model, fine-tuned from the Azure OpenAI o3 reasoning model, executes the research. The model offers a 200,000-token context window and supports up to 100,000 completion tokens, enabling it to process vast amounts of information, reason step by step, and dynamically adjust its approach as new insights emerge.
- Report Generation and Traceability, culminating in a structured, source-cited report that not only provides the final answer but also meticulously documents the model's step-by-step analytical path, a complete list of all citations used, and any clarifications sought during the session.
(Source: Microsoft Learn documentation)
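The sketch below shows, in rough strokes, how a developer might wire these stages together with the preview Python SDK. It loosely follows Microsoft's published Deep Research sample for the azure-ai-projects / azure-ai-agents packages; exact class and method names may differ between SDK versions, and the environment variable names, agent instructions, and research prompt here are illustrative assumptions.

```python
# Sketch: creating a Deep Research agent with the Azure AI Foundry Agents SDK (preview).
# Assumes an existing Foundry project, a Grounding with Bing Search connection,
# and deployments of o3-deep-research plus a GPT-series clarification model.
import os

from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient
from azure.ai.agents.models import DeepResearchTool

project_client = AIProjectClient(
    endpoint=os.environ["PROJECT_ENDPOINT"],  # Foundry project endpoint (assumed env var)
    credential=DefaultAzureCredential(),
)

# Resolve the Grounding with Bing Search connection used for web data discovery.
bing_conn_id = project_client.connections.get(name=os.environ["BING_RESOURCE_NAME"]).id

# Attach the Deep Research tool: the agent's GPT deployment handles intent
# clarification, while o3-deep-research performs the analytical execution.
deep_research = DeepResearchTool(
    bing_grounding_connection_id=bing_conn_id,
    deep_research_model=os.environ["DEEP_RESEARCH_MODEL_DEPLOYMENT_NAME"],
)

with project_client:
    agents = project_client.agents

    agent = agents.create_agent(
        model=os.environ["MODEL_DEPLOYMENT_NAME"],  # e.g. a gpt-4o deployment
        name="research-agent",
        instructions="You are an agent that produces source-cited research reports.",
        tools=deep_research.definitions,
    )

    thread = agents.threads.create()
    agents.messages.create(
        thread_id=thread.id,
        role="user",
        content="Summarize the past year of peer-reviewed work on solid-state batteries.",
    )

    # Deep Research runs can take several minutes; create_and_process polls to completion.
    run = agents.runs.create_and_process(thread_id=thread.id, agent_id=agent.id)

    # The final, citation-annotated report arrives as the last agent message on the thread.
    for message in agents.messages.list(thread_id=thread.id):
        print(message.role, getattr(message, "content", None))
```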
When Deep Research was first introduced months ago, early user feedback in a Hacker News thread suggested that human verification remains essential. Users reported instances where the tool generated factual errors, such as misinterpreting data from cited sources or misattributing content, even when links to sources were provided. Hence, while Deep Research offers a powerful springboard for research, its outputs should be treated with caution and thoroughly fact-checked by human users to ensure accuracy.
Lastly, the pricing for the o3-deep-research model starts at $10 per million input tokens and $40 per million output tokens, with additional charges for Bing Grounding and the GPT clarification stage. Developers interested in leveraging this capability can sign up for the limited public preview and explore the documentation and learning modules provided by Microsoft.
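As a rough illustration of what that pricing implies, the snippet below estimates only the o3-deep-research portion of a single run; the token counts are hypothetical, and Bing Grounding and the GPT clarification stage are billed separately.

```python
# Back-of-the-envelope cost estimate for the o3-deep-research portion of one run.
# Token counts below are hypothetical; Bing Grounding and clarification calls cost extra.
INPUT_PRICE_PER_M = 10.00    # USD per million input tokens
OUTPUT_PRICE_PER_M = 40.00   # USD per million output tokens

input_tokens = 150_000       # e.g. gathered web content plus intermediate reasoning context
output_tokens = 30_000       # e.g. the final structured, source-cited report

cost = (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
     + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M
print(f"Estimated o3-deep-research cost: ${cost:.2f}")  # -> $2.70
```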