By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
World of SoftwareWorld of SoftwareWorld of Software
  • News
  • Software
  • Mobile
  • Computing
  • Gaming
  • Videos
  • More
    • Gadget
    • Web Stories
    • Trending
    • Press Release
Search
  • Privacy
  • Terms
  • Advertise
  • Contact
Copyright © All Rights Reserved. World of Software.
Reading: OpenAI’s gpt-realtime Enables Production-Ready Voice Agents with End-to-End Speech Processing
Share
Sign In
Notification Show More
Font ResizerAa
World of SoftwareWorld of Software
Font ResizerAa
  • Software
  • Mobile
  • Computing
  • Gadget
  • Gaming
  • Videos
Search
  • News
  • Software
  • Mobile
  • Computing
  • Gaming
  • Videos
  • More
    • Gadget
    • Web Stories
    • Trending
    • Press Release
Have an existing account? Sign In
Follow US
  • Privacy
  • Terms
  • Advertise
  • Contact
Copyright © All Rights Reserved. World of Software.
World of Software > News > OpenAI’s gpt-realtime Enables Production-Ready Voice Agents with End-to-End Speech Processing
News

OpenAI’s gpt-realtime Enables Production-Ready Voice Agents with End-to-End Speech Processing

News Room
Last updated: 2025/09/11 at 5:30 AM
News Room Published 11 September 2025
Share
SHARE

OpenAI has released gpt-realtime, its most advanced speech-to-speech model, alongside the general availability of the Realtime API. The updates aim to reduce latency, improve speech quality, and give developers stronger tools, such as MCP server support, image input, and Session Initiation Protocol (SIP) phone calling support, for building production-ready AI voice agents.

The combined Realtime API and gpt-realtime is designed to handle end-to-end speech processing within a single system, rather than chaining together separate speech-to-text and text-to-speech models. This architecture cuts response times while preserving nuance in delivery, a critical improvement for real-time agents where even small delays can break conversational flow.

The gpt-realtime was trained to produce higher-quality speech with more natural pacing and intonation, and to respond reliably to style instructions such as “speak empathetically” or “use a professional tone.” Two new synthetic voices, Cedar and Marin, are available, and existing voices have been updated for greater realism.

On comprehension benchmarks, gpt-realtime shows measurable improvements. It can track non-verbal cues, switch languages within a single sentence, and more accurately process alphanumeric sequences (such as phone numbers, VINs, etc) across languages, including Spanish, Chinese, Japanese, and French. Internal testing highlights this jump, with gpt-realtime reaching 82.8% accuracy on Big Bench Audio compared to 65.6% for the previous model. Instruction-following is also sharper, with MultiChallenge audio benchmark scores rising from 20.6% to 30.5%.

Function calling is another area of focus. The model now performs better at identifying relevant functions, calling them at the right time, and supplying the correct arguments. On ComplexFuncBench, accuracy rose to 66.5% from 49.7%. There were updates to asynchronous function calling, allowing the voice agent to continue the conversation while waiting for results, a feature with obvious value for customer support and transactional applications.

The Realtime API has been upgraded to align with production requirements. Developers can now connect remote MCP servers directly into a session, enabling tool calls without manual integration work. Image input is supported, allowing applications to ground conversations in visual context, such as screenshots or photos. SIP support makes it possible to integrate voice agents with existing telephony systems, including PBXs and desk phones. Reusable prompts simplify session management, while full EU data residency support addresses compliance concerns for European deployments.

According to the release notes, early enterprise partners are testing these capabilities in production-like scenarios. Zillow is piloting voice-driven home search, while T-Mobile is exploring customer service use cases where real-time adaptability is essential. Both companies highlight the shift from scripted automation to more flexible, domain-specific expertise delivered through AI agents.

OpenAI has also reinforced safeguards around deployment. The Realtime API incorporates classifiers that can terminate harmful conversations, and developers can add domain-specific guardrails via the Agents SDK. Preset voices in Realtime API are used to reduce impersonation risks.

Both gpt-realtime model and Realtime API are immediately available to all developers. To get started, developers can visit the Realtime API documentation and prompting guide, and test the new gpt-realtime demo in the Playground.

Sign Up For Daily Newsletter

Be keep up! Get the latest breaking news delivered straight to your inbox.
By signing up, you agree to our Terms of Use and acknowledge the data practices in our Privacy Policy. You may unsubscribe at any time.
Share This Article
Facebook Twitter Email Print
Share
What do you think?
Love0
Sad0
Happy0
Sleepy0
Angry0
Dead0
Wink0
Previous Article Internet entrepreneur Kim Dotcom’s latest legal bid to halt deportation from New Zealand is rejected
Next Article Three major Chinese mobile operators adopt eSIM for iPhone Air · TechNode
Leave a comment

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Stay Connected

248.1k Like
69.1k Follow
134k Pin
54.3k Follow

Latest News

AMD EPYC 9575F CPUs For GPU/AI Servers Show Leading Performance In Benchmarks Review
Computing
Apple’s iPhone 16 is now only available with 128GB of storage
News
Microsoft 365 Copilot bundles sales, service, and finance Copilots in October
News
AgiBot unveils Lingxi X2, an advanced humanoid robot with multimodal intelligence · TechNode
Computing

You Might also Like

News

Apple’s iPhone 16 is now only available with 128GB of storage

2 Min Read
News

Microsoft 365 Copilot bundles sales, service, and finance Copilots in October

2 Min Read
News

iPhone 17 Pro Camera Upgrades: What’s Changed This Year – BGR

4 Min Read
News

Apple exec likes that customers will struggle to choose between iPhone Air & Pro

1 Min Read
//

World of Software is your one-stop website for the latest tech news and updates, follow us now to get the news that matters to you.

Quick Link

  • Privacy Policy
  • Terms of use
  • Advertise
  • Contact

Topics

  • Computing
  • Software
  • Press Release
  • Trending

Sign Up for Our Newsletter

Subscribe to our newsletter to get our newest articles instantly!

World of SoftwareWorld of Software
Follow US
Copyright © All Rights Reserved. World of Software.
Welcome Back!

Sign in to your account

Lost your password?