OpenAI has released Open Responses, an open specification to standardize agentic AI workflows and reduce API fragmentation. Backed by partners such as Hugging Face and Vercel, along with several local inference providers, the spec introduces unified standards for agentic loops, reasoning visibility, and internal versus external tool execution. The goal is to let developers switch between proprietary and open-source models without rewriting integration code.
The specification formalizes concepts such as items, reasoning visibility, and tool execution models, allowing model providers to run multi-step agentic workflows (repeating cycles of reasoning, tool invocation, and reflection) entirely within their own infrastructure and return the final result in a single API request. Native support for multimodal inputs, streaming events, and cross-provider tool calling further reduces the translation work when switching between frontier models and open-source alternatives.
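As a rough illustration of that single-round-trip model, the sketch below sends one request to a spec-compliant endpoint and reads back the items the provider returns. It assumes an OpenAI-compatible Python SDK; the base URL, model name, and API key are placeholders, and a given provider's details may differ.

```python
# A minimal sketch, assuming an OpenAI-compatible Python client pointed at a
# hypothetical provider that implements the spec's /v1/responses endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="https://models.example.com/v1",  # placeholder provider URL
    api_key="YOUR_API_KEY",                    # placeholder credential
)

# With internal (provider-executed) tools, the provider runs the whole agentic
# loop (reasoning, tool calls, reflection) and returns the final result in
# this single round-trip.
response = client.responses.create(
    model="example-model",           # placeholder model name
    input="Summarize the key findings in last quarter's incident reports.",
    tools=[{"type": "web_search"}],  # an internal tool the provider executes
)

# The output is a list of items: reasoning, tool calls, messages, and so on.
for item in response.output:
    print(item.type)

print(response.output_text)  # convenience accessor for the final message text
```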
The core concepts introduced in the specification are items, tool use, and the agentic loop. An item is an atomic unit representing model input, output, tool invocations, or reasoning state; examples include the message, function_call, and reasoning types. Items are extensible, allowing providers to emit custom types beyond those defined in the specification. One notable item type is reasoning, which exposes a model's thought process in a provider-controlled manner: the payload can include raw reasoning content, protected content, or summaries, giving developers visibility into how models reach conclusions while letting providers control disclosure.
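To make the item concept concrete, here is a rough sketch of three item payloads as plain Python data. The field names follow the general shape of the published schema, but the tool name, arguments, and text values are invented for illustration; openresponses.org hosts the normative schema.

```python
# Illustrative item payloads (shapes are approximate; see openresponses.org
# for the normative schema). All concrete values below are invented.

# A reasoning item: the provider decides how much to disclose, from raw
# reasoning content to protected content or summaries only.
reasoning_item = {
    "type": "reasoning",
    "summary": [
        {"type": "summary_text", "text": "Compared both sources before answering."}
    ],
}

# A function_call item: the model asking for a tool to be invoked.
function_call_item = {
    "type": "function_call",
    "name": "lookup_order",                 # hypothetical tool name
    "arguments": '{"order_id": "A-1042"}',  # JSON-encoded arguments
    "call_id": "call_abc123",
}

# A message item: ordinary model output addressed to the user.
message_item = {
    "type": "message",
    "role": "assistant",
    "content": [{"type": "output_text", "text": "Your order ships Friday."}],
}
```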
Open Responses distinguishes between internal and external tools to define where the orchestration logic resides. Internal tools are executed directly within the provider's infrastructure, allowing the model to autonomously manage the agentic loop; the provider can, for example, search documents and summarize findings before returning the final result in a single API round-trip. External tools, by contrast, are executed within the developer's application code: the provider pauses the loop to request a tool call, and the developer must execute the tool and return its output for the model to continue, as shown in the sketch below.
Image source: openresponses.org
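The sketch below illustrates the external-tool case under the same assumptions as the earlier example: an OpenAI-compatible Python client, placeholder endpoint and model names, and a hypothetical lookup_order function defined in the application. The provider pauses with a function_call item; the application executes the tool and sends back a function_call_output item so the loop can resume.

```python
import json

from openai import OpenAI

client = OpenAI(
    base_url="https://models.example.com/v1",  # placeholder provider URL
    api_key="YOUR_API_KEY",
)

def lookup_order(order_id: str) -> str:
    """Hypothetical application-side tool the provider cannot run itself."""
    return json.dumps({"order_id": order_id, "status": "shipped"})

# First round-trip: the model decides it needs the external tool and pauses.
response = client.responses.create(
    model="example-model",  # placeholder model name
    input="Where is order A-1042?",
    tools=[{
        "type": "function",
        "name": "lookup_order",
        "description": "Look up an order's shipping status by ID.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    }],
)

# Execute any requested calls in application code and collect the outputs.
tool_outputs = []
for item in response.output:
    if item.type == "function_call" and item.name == "lookup_order":
        args = json.loads(item.arguments)
        tool_outputs.append({
            "type": "function_call_output",
            "call_id": item.call_id,
            "output": lookup_order(**args),
        })

# Second round-trip: return the tool outputs so the model can continue.
final = client.responses.create(
    model="example-model",
    previous_response_id=response.id,  # resume the same agentic loop
    input=tool_outputs,
)
print(final.output_text)
```

The contrast with internal tools is the extra round-trip: with external tools, orchestration state lives in the application rather than with the provider.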
The specification has seen early adoption from partners including Hugging Face, OpenRouter, and Vercel, as well as local inference providers such as LM Studio, Ollama, and vLLM, bringing standardized agentic workflows to local machines.
The announcement has prompted discussion regarding vendor lock-in and ecosystem maturity. Rituraj Pramanik noted:
Building an “open” standard on top of OpenAI’s API is slightly ironic, but practical. The real nightmare is fragmentation; we waste so much time gluing different schemas together. If this spec stops me from writing another “wrapper for a wrapper” and makes model swapping painless, you are solving the single biggest headache in agentic development.
Other developers view the move as a signal of growing maturity in the LLM landscape. AI developer and educator Sam Witteveen predicts:
Expect frontier open model labs (Qwen, Kimi, DeepSeek) to train models compatible with BOTH Open Responses AND the Anthropic API. Ollama just announced Anthropic API compatibility too, meaning high-quality local models running with the ability to use Claude Code tools is not too far away. This could be a huge win for developers wanting to switch between proprietary and open models without rewriting their stack.
The Open Responses specification, schema, and compliance test tool are now available at the project’s official website, and Hugging Face has released a demo application for developers to see the spec in action.
