How To Build A Multilingual Text-to-Audio Converter With Python

“To have another language is to possess a second soul.”
— Charlemagne

Imagine traveling to a new country and being able to hold a seamless conversation in the local language. That is what we will try to achieve in this article by building a simple text-to-audio converter app using Python, the googletrans library for translation, and gTTS for text-to-speech conversion. We will go over the complete code, how the different components work, and how to use each library to translate English text into another language and then convert it to audio in that language.

The different components

There are three parts to this:

Translation – googletrans, a Python library that uses Google Translate to translate text between languages
Text-to-speech – gTTS (Google Text-to-Speech), which converts text to audio in the language of our choice
Audio playback – pygame, which is primarily used for developing games, but which we will use here to play back the audio generated by gTTS

Prerequisites

We can use the pip command in a terminal to install the required libraries:

pip install gTTS googletrans==4.0.0-rc1 pygame

Note: You might encounter the error below when running the Python code –

AttributeError: 'coroutine' object has no attribute 'text'
sys:1: RuntimeWarning: coroutine 'Translator.translate' was never awaited

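Fix – Make sure you have the correct version of googletrans installed. The error means that translate returned a coroutine instead of a finished result, which happens when the installed version exposes translate as an asynchronous method. Version 4.0.0-rc1 is known to work well for synchronous calls.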
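
If pinning the version is not an option, one possible workaround is to check whether translate handed back a coroutine and, if so, run it to completion. The sketch below is an illustration under assumptions, not part of the original code: resolve_translation is a hypothetical helper, and the two stand-in translator classes only mimic synchronous and asynchronous behaviour so that the example runs without googletrans or network access.

```python
import asyncio
import inspect
from types import SimpleNamespace


def resolve_translation(result):
    """Return a finished result, running it first if it is a coroutine."""
    if inspect.iscoroutine(result):
        # Assumption: an async googletrans release returned a coroutine here.
        return asyncio.run(result)
    return result


class SyncTranslator:
    """Stand-in that mimics a synchronous translate() call."""

    def translate(self, text, dest):
        return SimpleNamespace(text=f"[{dest}] {text}")


class AsyncTranslator:
    """Stand-in that mimics an asynchronous translate() call."""

    async def translate(self, text, dest):
        return SimpleNamespace(text=f"[{dest}] {text}")


# Both stand-ins end up producing an object with a .text attribute.
for translator in (SyncTranslator(), AsyncTranslator()):
    translation = resolve_translation(translator.translate("Hello", dest="es"))
    print(translation.text)
```

In translate_text, the call would then become resolve_translation(translator.translate(text, dest=dest_language)). Note that asyncio.run cannot be used from code that is already running inside an event loop, so this approach only suits plain scripts like the one in this article.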

Implementation

translate_text

The translate_text function uses googletrans to translate text. It takes two parameters: text, the string to translate, and dest_language, the target language code (e.g., 'es' for Spanish). Inside the function, we create a Translator object and call its translate method. The function then returns the text attribute of the result, which holds the translated string.
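
Because dest_language is a free-form string, it can help to validate it before calling the translator. The following is a minimal sketch, assuming a hypothetical helper named normalize_language_code and a small sample table; googletrans exposes a fuller mapping of codes to language names as googletrans.LANGUAGES, which could be passed in place of the sample.

```python
def normalize_language_code(code, supported):
    """Return a trimmed, lower-cased language code, or raise ValueError."""
    cleaned = code.strip().lower()
    if cleaned not in supported:
        raise ValueError(f"Unsupported language code: {code!r}")
    return cleaned


# Illustrative sample only; not the full list of supported languages.
SAMPLE_LANGUAGES = {"en": "english", "es": "spanish", "ja": "japanese"}

print(normalize_language_code(" ES ", SAMPLE_LANGUAGES))
```

gTTS keeps its own list of supported languages (available through gtts.lang.tts_langs()), and the two lists are not guaranteed to use identical codes, so a code that passes this check may still need checking on the text-to-speech side.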

text_to_audio

The text_to_audio function converts text to audio using gTTS and pygame. It takes two parameters: text and language. The language should match the dest_language passed to translate_text, since we want the audio in the same language as the translation. The function generates speech with gTTS and saves it as an MP3 file. We then initialize pygame.mixer to handle audio playback, load the MP3, and play it. A loop waits until the audio has finished playing, after which the audio file is deleted if should_clean_up_file is set to True.
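
The save, play, and clean-up sequence can be sketched on its own, independent of gTTS and pygame. This is a simplified illustration under assumptions: play_then_clean_up is a hypothetical helper, the audio bytes are placeholder data, and play stands for any callback that blocks until playback ends. Unlike the article's code, it writes to a temporary file rather than to the current folder.

```python
import os
import tempfile


def play_then_clean_up(audio_bytes, play, should_clean_up_file=True):
    """Write audio to a temporary MP3 file, play it, then optionally delete it."""
    fd, mp3_file = tempfile.mkstemp(suffix=".mp3")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(audio_bytes)
        # Hypothetical callback; expected to block until playback finishes.
        play(mp3_file)
    finally:
        # Runs even if writing or playback raised an exception.
        if should_clean_up_file and os.path.exists(mp3_file):
            os.remove(mp3_file)
    return mp3_file


# A list's append method stands in for a real audio player here.
played = []
path = play_then_clean_up(b"placeholder audio data", played.append)
print(played == [path], os.path.exists(path))
```

Putting the removal in a finally block means the file does not linger when playback fails partway through, which is the same reason the article's function uses try and finally.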

Below is the complete code –

from gtts import gTTS
from googletrans import Translator
import pygame
import os

def translate_text(text, dest_language):
    translator = Translator()
    translation = translator.translate(text, dest=dest_language)
    return translation.text

def text_to_audio(text, language):
    mp3_file = f'{language}_output.mp3'
    should_clean_up_file = True
    try:
        tts_file = gTTS(text=text, lang=language, slow=False)
        tts_file.save(mp3_file)
        pygame.mixer.init()
        pygame.mixer.music.load(mp3_file)
        pygame.mixer.music.play()
        while pygame.mixer.music.get_busy():
            pygame.time.Clock().tick(15)
    finally:
        # Delete the generated MP3 once playback is done (or has failed)
        if os.path.exists(mp3_file) and should_clean_up_file:
            os.remove(mp3_file)


def main(english_text, target_language='en'):

    translated_text = translate_text(english_text, target_language)
    print(f"English Text: {english_text}")
    print(f"Translated Text: {translated_text}")

    text_to_audio(translated_text, target_language)


if __name__ == "__main__":
    english_text = "Hello, welcome to the world of text-to-speech conversion using Python."
    target_language = 'es'  # Spanish
    main(english_text, target_language)

Input1 – English to Spanish:

english_text = "Hello, welcome to the world of text-to-speech conversion using Python."
target_language = 'es'  # Spanish
main(english_text, target_language)

Output:

English to Spanish translation

Audio output:

Spanish Audio file

This creates es_output.mp3 in your current folder, which pygame then plays. Because should_clean_up_file is set to True, the file is deleted once playback finishes.

Input2 – English to Japanese:

english_text = "Hello, welcome to the world of text-to-speech conversion using Python."
target_language = 'ja'  # Japanese
main(english_text, target_language)

Output:

English to Japanese translation

Audio output:

Japanese Audio file

This creates ja_output.mp3 in your current folder, which pygame then plays. Because should_clean_up_file is set to True, the file is deleted once playback finishes.

Applications and Use Cases

Accessibility – This can be integrated into a tourism app or website to help people explore a foreign country where they don’t speak the local language, so they can travel with confidence
Language Learning – Someone learning a new language can use this tool to self-teach: enter the text to translate and get back the translated text along with audio, which also helps with pronunciation
Content Consumption – For people who want to multitask, say by listening to written content while driving, this tool is handy because it can read the content aloud, at normal speed or at the slower pace enabled by the slow parameter of gTTS
Multilingual Communication – In today’s world, where multinational deals are common, being able to convey your thoughts and business proposals in the other party’s language is a powerful asset that can help close deals

Conclusion

Many fields can benefit from this application. It’s simple to build, yet broadly useful. By developing this tool, we have not only addressed a real-world problem that many people face but also learned how to use Python to call external services through libraries, initialize objects, invoke methods, organize code into functions, and use try/finally to clean up files after use. Once you have mastered these and want a challenge, you can try building an interactive GUI and hosting it on a web server to make it more user-friendly, and add features such as options to change pronunciation, pace, etc. The possibilities are endless, and we hope you keep pushing the boundaries of how we can use technology and coding to advance humankind.
