Implementing Text to Speech Translation

Learn how to perform text-to-speech conversion using the Azure Speech SDK for Python.

Introduction

In this lesson, we’re going to explore text-to-speech conversion using the Azure Speech service. Text-to-speech, also referred to as TTS, generates audio from text data in near real time. We can provide the text to the Azure Speech service either as a file or as a plain sentence, and it will generate the corresponding audio data.
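To make the flow above concrete, here is a minimal sketch, assuming the `azure-cognitiveservices-speech` package is installed and that you have a Speech resource key and region. The helper names `read_input_text` and `synthesize_to_wav` are hypothetical and chosen only for illustration; they are not part of the SDK.

```python
import os


def read_input_text(source: str) -> str:
    """Return the file's contents if `source` is an existing file path,
    otherwise treat `source` itself as the sentence to speak."""
    if os.path.isfile(source):
        with open(source, encoding="utf-8") as handle:
            return handle.read().strip()
    return source


def synthesize_to_wav(text: str, key: str, region: str, out_path: str) -> bool:
    """Send `text` to the Speech service and write the audio to `out_path`.

    Returns True if the service reports that synthesis completed.
    """
    # Imported lazily so read_input_text works without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    # `key` and `region` come from your own Azure Speech resource (assumed here).
    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    audio_config = speechsdk.audio.AudioOutputConfig(filename=out_path)
    synthesizer = speechsdk.SpeechSynthesizer(
        speech_config=speech_config, audio_config=audio_config
    )

    # speak_text_async returns a future; get() blocks until synthesis finishes.
    result = synthesizer.speak_text_async(text).get()
    return result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted
```

A call might look like `synthesize_to_wav(read_input_text("notes.txt"), key, region, "output.wav")`, where `notes.txt` and `output.wav` are placeholder file names. If `notes.txt` does not exist, the string itself is sent as the sentence to speak.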

The model used behind the scenes to convert text into speech is the same one Microsoft uses in its Office products and Cortana.

Different audio types and languages

We can generate speech from text in more than 270 neural voices across more than 110 languages and their variants. You can visit the Azure demo provided by Microsoft to try out different voices and find the one that best suits your needs. Below is a list of the voices available for the English language.
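Once a voice has been chosen, one common way to request it is through SSML (Speech Synthesis Markup Language). The sketch below only builds the SSML string and does not call the service. The function name `build_ssml` is hypothetical, and `en-US-JennyNeural` is used as an example voice name; check the voice list for the names available to your resource.

```python
from xml.sax.saxutils import escape, quoteattr

# Namespace used by the SSML <speak> root element.
SSML_NAMESPACE = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text: str, voice: str = "en-US-JennyNeural",
               language: str = "en-US") -> str:
    """Wrap `text` in a minimal SSML document that selects `voice`.

    The defaults are example values (an assumption), not recommendations.
    """
    # quoteattr adds the surrounding quotes and escapes attribute values;
    # escape protects characters such as & and < in the spoken text.
    return (
        f'<speak version="1.0" xmlns="{SSML_NAMESPACE}" '
        f"xml:lang={quoteattr(language)}>"
        f"<voice name={quoteattr(voice)}>{escape(text)}</voice>"
        "</speak>"
    )
```

The resulting string could then be passed to the SDK's `speak_ssml_async` method on a `SpeechSynthesizer`. For plain text, setting `speech_synthesis_voice_name` on the `SpeechConfig` is an alternative way to pick the voice.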
