World of Software
Copyright © All Rights Reserved. World of Software.

OpenAI Introduces New Speech Models for Transcription and Voice Generation

News Room
Published 31 March 2025. Last updated: 2025/03/31 at 2:59 PM

OpenAI has introduced new speech-to-text and text-to-speech models in its API, focusing on improving transcription accuracy and offering more control over AI-generated voices. These updates aim to enhance automated speech applications, making them more adaptable to different environments and use cases.

The new gpt-4o-transcribe and gpt-4o-mini-transcribe models achieve a lower word error rate (WER) than previous versions, including Whisper v2 and v3. They are designed to better handle accents, background noise, and variations in speech speed, making them more reliable in real-world scenarios such as customer support calls, meeting transcriptions, and multilingual conversations.
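For readers unfamiliar with the metric, WER is commonly computed as the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch of that common definition; the function name and the lowercase, whitespace-based tokenization are assumptions for illustration and not necessarily the evaluation procedure OpenAI used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: real evaluations usually normalize punctuation,
    numbers, and casing more carefully than this sketch does.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)
```

A lower value is better: a transcript that drops one word out of a six-word reference scores 1/6, and an exact match scores 0.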


Source: OpenAI Blog

Training improvements, including reinforcement learning and exposure to a more diverse dataset, contribute to fewer transcription errors and better recognition of spoken language. These models are now available through the speech-to-text API.
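As a rough sketch of what calling the speech-to-text API can look like, the example below uses the official `openai` Python package. The article does not name a client library, so the package, the helper name `transcribe`, and the file path in the usage note are assumptions. The call also expects an API key in the `OPENAI_API_KEY` environment variable.

```python
from pathlib import Path

# Model names as given in the article.
TRANSCRIBE_MODELS = ("gpt-4o-transcribe", "gpt-4o-mini-transcribe")


def transcribe(audio_path: str, model: str = "gpt-4o-mini-transcribe") -> str:
    """Send a local audio file to the speech-to-text API and return the text.

    Hypothetical helper for illustration; error handling is kept minimal.
    """
    if model not in TRANSCRIBE_MODELS:
        raise ValueError(f"unsupported model: {model!r}")
    path = Path(audio_path)
    if not path.is_file():
        raise FileNotFoundError(audio_path)

    # Imported here so the module can be loaded without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with path.open("rb") as audio_file:
        result = client.audio.transcriptions.create(model=model, file=audio_file)
    return result.text
```

Usage would be along the lines of `print(transcribe("meeting.wav"))`, where `meeting.wav` stands in for any supported audio file.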

The gpt-4o-mini-tts model introduces a new level of steerability, allowing developers to guide how the AI speaks. For example, users can specify that a response should sound like a sympathetic customer service agent or an engaging storyteller. This added flexibility makes it easier to tailor AI-generated speech to different contexts, including automated assistance, narration, and content creation.
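The steerability described above can be sketched with the `openai` Python package, where the speaking style is passed as free-form text alongside the input. The package choice, the helper names, the `coral` voice, and the output file name are assumptions for illustration; the article only names the model. An API key is expected in `OPENAI_API_KEY`.

```python
def build_speech_request(text: str, style: str, voice: str = "coral") -> dict:
    """Assemble keyword arguments for a text-to-speech request.

    `style` is a plain-language description of how the voice should sound,
    for example "a sympathetic customer service agent".
    """
    return {
        "model": "gpt-4o-mini-tts",  # model name from the article
        "voice": voice,              # assumed voice name
        "input": text,
        "instructions": style,
    }


def synthesize(text: str, style: str, out_path: str = "speech.mp3") -> None:
    """Generate speech in the requested style and write it to out_path."""
    # Imported here so the module can be loaded without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    request = build_speech_request(text, style)
    with client.audio.speech.with_streaming_response.create(**request) as response:
        response.stream_to_file(out_path)
```

Calling `synthesize("Your order is on its way.", "Speak like a sympathetic customer service agent.")` would write the audio to `speech.mp3`. Changing only the `style` string is what switches the delivery, for instance to an engaging storyteller.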

While the voices remain synthetic, OpenAI has focused on maintaining consistency and quality to ensure they meet the needs of various applications.

Reactions to the new models have been positive. Harald Wagener, a head of project management at BusinessCoDe GmbH, highlighted the range of available voice options, saying:

Great playground to find the perfect style for your use case. And it sounds amazing, thanks for building and sharing!

Luke McPhail compared OpenAI’s models to other industry offerings, stating:

First impressions of OpenAI FM: It does not quite match AI audio leaders like ElevenLabs, but that might not matter. Its huge market share and easy-to-use API will make it appealing for developers.

Developers have also appreciated the models for their seamless integration and usability. Some noted that while OpenAI’s speech models may not yet surpass specialized audio solutions, their accessibility and well-structured API make them a practical choice for many applications.

These new speech-to-text and text-to-speech models are now available. Developers can integrate them into their applications using the Agents SDK, streamlining the process of adding voice capabilities.

OpenAI plans to further improve the intelligence and accuracy of its audio models while exploring ways for developers to create custom voices for more personalized applications. Ensuring these capabilities align with safety and ethical standards remains a priority.
