
Implementing Text to Speech Translation

Learn how to perform text-to-speech conversion using the Azure Speech SDK for Python.

We'll cover the following...

Introduction
Different audio types and languages
Implementation

Sending a normal sentence for audio conversion
Generating audio for a text file

Introduction

In this lesson, we’re going to explore text-to-speech conversion using the Azure Speech service. Text-to-speech, also referred to as TTS, generates spoken audio from text data in real time. We can provide the text either as a file or as a plain sentence, and the Azure Speech service will generate the corresponding audio.

The model used behind the scenes to convert text into speech is the same one Microsoft uses in its Office products and Cortana.
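As a rough preview of the two input styles mentioned above (a plain sentence or a text file), here is a minimal sketch using the Azure Speech SDK for Python (the `azure-cognitiveservices-speech` package). The helper names `text_from_file` and `synthesize_to_wav`, the default voice, and the output path are illustrative assumptions rather than part of the lesson, and the key and region must come from your own Azure Speech resource.

```python
from pathlib import Path


def text_from_file(path: str) -> str:
    """Read the text to be spoken from a UTF-8 text file (hypothetical helper)."""
    return Path(path).read_text(encoding="utf-8").strip()


def synthesize_to_wav(text: str, out_path: str, key: str, region: str,
                      voice: str = "en-US-AmberNeural") -> bool:
    """Send `text` to the Azure Speech service and save the audio to `out_path`.

    Returns True if the service reports that synthesis completed.
    """
    # Imported here so the file helper above works without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    # Credentials of the Speech resource (assumed to be supplied by the caller).
    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    speech_config.speech_synthesis_voice_name = voice

    # Write the generated audio to a file instead of playing it on a speaker.
    audio_config = speechsdk.audio.AudioOutputConfig(filename=out_path)
    synthesizer = speechsdk.SpeechSynthesizer(
        speech_config=speech_config, audio_config=audio_config
    )

    result = synthesizer.speak_text_async(text).get()
    return result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted
```

A plain sentence would be passed directly, for example `synthesize_to_wav("Hello there", "hello.wav", key, region)`, while a file would first go through `text_from_file("notes.txt")`.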

Different audio types and languages

We can generate speech from text using more than 270 neural voices across more than 110 languages and their variants. You can visit the Azure demo provided by Microsoft to try out different voices and find the one that best suits your needs. The table below lists some of the voices available for the English language.

Different English voices available in Azure Speech

Language	Locale	Voice name	Details
English (Australia)	`en-AU`	`en-AU-NatashaNeural`	It is a neutral female voice.
English (Australia)	`en-AU`	`en-AU-WilliamNeural`	It is a neutral male voice.
English (India)	`en-IN`	`en-IN-NeerjaNeural`	It is a neutral female voice.
English (India)	`en-IN`	`en-IN-PrabhatNeural`	It is a neutral male voice.
English (United Kingdom)	`en-GB`	`en-GB-LibbyNeural`	It is a neutral female voice.
English (United Kingdom)	`en-GB`	`en-GB-RyanNeural`	It is a neutral male voice.
English (United States)	`en-US`	`en-US-AshleyNeural`	It is a neutral female voice.
English (United States)	`en-US`	`en-US-AmberNeural`	It is a neutral female voice.
English (United States)	`en-US`	`en-US-AnaNeural`	It is a neutral child voice.
English (United States)	`en-US`	`en-US-ChristopherNeural`	It is a neutral male voice.
English (United States)	`en-US`	`en-US-JacobNeural`	It is a neutral male voice.
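To make the table easier to use from code, the sketch below stores the voice names listed above, filters them by locale, and wraps a sentence in an SSML document that names the chosen voice. The helper names `voices_for` and `build_ssml` and the short `kind` labels are assumptions made for illustration; only the voice names and locales come from the table.

```python
from xml.sax.saxutils import escape, quoteattr

# (locale, voice name, kind) rows copied from the table above.
ENGLISH_VOICES = [
    ("en-AU", "en-AU-NatashaNeural", "female"),
    ("en-AU", "en-AU-WilliamNeural", "male"),
    ("en-IN", "en-IN-NeerjaNeural", "female"),
    ("en-IN", "en-IN-PrabhatNeural", "male"),
    ("en-GB", "en-GB-LibbyNeural", "female"),
    ("en-GB", "en-GB-RyanNeural", "male"),
    ("en-US", "en-US-AshleyNeural", "female"),
    ("en-US", "en-US-AmberNeural", "female"),
    ("en-US", "en-US-AnaNeural", "child"),
    ("en-US", "en-US-ChristopherNeural", "male"),
    ("en-US", "en-US-JacobNeural", "male"),
]


def voices_for(locale, kind=None):
    """Return the voice names for a locale, optionally filtered by kind."""
    return [
        name
        for loc, name, k in ENGLISH_VOICES
        if loc == locale and (kind is None or k == kind)
    ]


def build_ssml(text, voice_name):
    """Wrap `text` in an SSML document that requests `voice_name`."""
    # The locale is the first two parts of the voice name, e.g. "en-GB".
    locale = "-".join(voice_name.split("-")[:2])
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(locale)}>"
        f"<voice name={quoteattr(voice_name)}>{escape(text)}</voice>"
        "</speak>"
    )


if __name__ == "__main__":
    voice = voices_for("en-GB", kind="male")[0]
    print(build_ssml("Good morning!", voice))
```

The text is escaped before it is placed in the SSML, so characters such as `&` or `<` in the input do not break the XML. A string built this way can be passed to the Speech SDK's `speak_ssml_async` method instead of plain text.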
