Copyright © All Rights Reserved. World of Software.

OpenAI's new audio models in the API can be used to build speaking AI agents

News Room
Last updated: 2025/03/21 at 9:42 AM
News Room Published 21 March 2025

OpenAI on Thursday introduced new audio models in its application programming interface (API) that offer improved accuracy and reliability. The San Francisco-based AI firm released three new artificial intelligence (AI) models for both speech-to-text transcription and text-to-speech (TTS) functions. The company claimed that these models will enable developers to build applications with agentic workflows. It also stated that the API can enable businesses to automate customer support-like operations. Notably, the new models are based on the company's GPT-4o and GPT-4o mini AI models.

OpenAI brings new audio models to the API

In a blog post, the AI firm detailed the new API-specific AI models. The company highlighted that over the years it has released several AI agents and agent-building tools, such as Operator, Deep Research, Computer-Using Agents, and the Responses API with built-in tools. However, it added that the true potential of agents can only be unlocked when they can perform intuitively and interact across mediums beyond text.

There are three new audio models. GPT-4o-transcribe and GPT-4o-mini-transcribe are the speech-to-text models, and GPT-4o-mini-tts is, as the name suggests, a TTS model. OpenAI claims that these models outperform its existing Whisper models, which were released in 2022. However, unlike the older models, the new ones are not open-source.

Coming to GPT-4o-transcribe, the AI firm stated that it showcases improved "word error rate" (WER) performance on the Few-shot Learning Evaluation of Universal Representations of Speech (FLEURS) benchmark, which tests AI models on multilingual speech across 100 languages. OpenAI said the improvements were a result of targeted training techniques.
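For readers unfamiliar with the metric, word error rate is conventionally the number of word-level substitutions, deletions, and insertions needed to turn the model's transcript into the reference transcript, divided by the number of words in the reference. The sketch below is a minimal illustration of that standard definition, not OpenAI's evaluation code; the function name `word_error_rate` and the whitespace tokenisation are assumptions made for the example, and real benchmarks usually normalise casing and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

A lower value is better: 0.0 means the transcript matches the reference word for word.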

These speech-to-text models can transcribe audio accurately even in challenging scenarios such as heavy accents, noisy environments, and varying speech speeds.
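As a rough sketch of how a developer might call one of the speech-to-text models, the helper below sends an audio file to the transcription endpoint through a client object from OpenAI's official Python library. The helper name `transcribe_file` and the file name in the usage section are hypothetical; the model identifier `gpt-4o-transcribe` is assumed to be the lowercase API form of the model named in the article.

```python
def transcribe_file(client, audio_path: str, model: str = "gpt-4o-transcribe") -> str:
    """Send one audio file for transcription and return the text.

    `client` is expected to behave like `openai.OpenAI()`; it is passed in
    so the helper can be exercised without network access.
    """
    with open(audio_path, "rb") as audio_file:
        result = client.audio.transcriptions.create(model=model, file=audio_file)
    return result.text


if __name__ == "__main__":
    # Requires `pip install openai` and an OPENAI_API_KEY environment variable.
    from openai import OpenAI

    # "meeting.wav" is a placeholder path, not a file referenced by the article.
    print(transcribe_file(OpenAI(), "meeting.wav"))
```

Passing `model="gpt-4o-mini-transcribe"` would select the smaller model instead.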

The GPT-4o-mini-tts model also comes with significant improvements. The AI firm claims that the model can speak with customisable inflections, intonations, and emotional expressiveness. This will enable developers to build applications that can be used for a wide range of tasks, including customer service and creative storytelling. Notably, the model only offers artificial and preset voices.
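To illustrate what steering the delivery might look like in practice, the sketch below builds (but does not send) an HTTP request for OpenAI's speech endpoint using only the Python standard library. The helper name `build_speech_request`, the sample text, and the choice of the `alloy` preset voice are assumptions made for the example; the `instructions` field is where the speaking style is described.

```python
import json
import urllib.request

SPEECH_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(text: str, instructions: str, api_key: str,
                         voice: str = "alloy") -> urllib.request.Request:
    """Build a POST request asking gpt-4o-mini-tts to speak `text`."""
    payload = {
        "model": "gpt-4o-mini-tts",
        "input": text,                 # the words to be spoken
        "voice": voice,                # one of the preset voices
        "instructions": instructions,  # tone, pacing, emotion, and so on
    }
    return urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    import os

    request = build_speech_request(
        "Your order has shipped and should arrive on Friday.",
        "Speak like a warm, upbeat customer support agent.",
        os.environ["OPENAI_API_KEY"],
    )
    # Sending the request returns audio bytes, which are saved to disk here.
    with urllib.request.urlopen(request) as response, open("reply.mp3", "wb") as out:
        out.write(response.read())
```

Keeping the request construction separate from the network call makes it easy to inspect or log the payload before any audio is generated.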

OpenAI's API pricing page highlights that the GPT-4o-based audio model will cost $40 (roughly Rs. …). On the other hand, the GPT-4o mini-based audio models will be charged at the rate of $10 (roughly Rs. 860) per million input tokens and $20 (roughly Rs. …).
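Per-million-token pricing scales linearly, so a quick estimate is just the token count divided by one million, multiplied by the rate. The sketch below applies that arithmetic to the one complete figure quoted above, $10 per million input tokens for the GPT-4o mini-based models; the function name `input_token_cost_usd` and the 250,000-token workload are hypothetical.

```python
from decimal import Decimal


def input_token_cost_usd(tokens: int, usd_per_million: Decimal) -> Decimal:
    """Cost in US dollars of `tokens` input tokens at a per-million rate."""
    return Decimal(tokens) / Decimal(1_000_000) * usd_per_million


if __name__ == "__main__":
    # 250,000 input tokens at $10 per million input tokens comes to $2.50.
    print(input_token_cost_usd(250_000, Decimal("10")))
```

`Decimal` is used instead of floating point so that currency amounts are not affected by binary rounding.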

All of the audio models are available to developers via the API. OpenAI is also releasing an integration with its Agents software development kit (SDK) to help users build voice agents.
