AzureTTSVoiceGeneratorGUI: An App to create Voice Messages using Azure Cognitive Services Text-to-Speech

Hi All,

One of the recurring activities of a Phone System Admin, for Microsoft Teams or any other PBX, is the generation of Recorded Voice Messages.
There are three methods to obtain these messages:

  1. Professional voice greeting providers: very high quality, high costs, high latency, no flexibility
  2. Non-Professional (home made) voice greetings: very poor quality, no costs, low/no latency, flexibility
  3. Text-to-Speech Voice Generator: poor quality, no costs, no latency, flexibility

I do not want to consider option #2 (even if I’ve done it a few times). Option #1 is the best in quality, but I’ve not seen it used many times. The TTS option only needs better quality, everything else is already perfect, so I started to wonder whether I could use the power of Azure Cognitive Services to generate high-quality Text-to-Speech voice messages… and this App is the answer!

In this article you will find:
Introduction to Azure Cognitive Services
How to create a Free Test Account
How to create a Free Permanent Account in your Azure Tenant
How to use the App

Quick Download from GitHub

AzureTTSVoiceGeneratorGUI.ps1
Version 1.0
GitHub Repository: https://github.com/LucaVitali/AzureTTSVoiceGeneratorGUI 

Introduction to Azure Cognitive Services

Microsoft Azure Cognitive Services are a collection of APIs that help bots, websites and programs use natural methods of communication.
https://docs.microsoft.com/en-us/azure/cognitive-services/

There are six different areas (vision, speech, knowledge, search, language and anomaly detection) where Azure Cognitive Services allows developers to build applications that walk, talk and interact with users like a real human being.

In this App I use the Text-to-Speech Service REST API (https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/rest-text-to-speech) to generate new voice messages. It’s very easy (even for a non-developer like me) and it’s free for normal needs.

Using Azure Cognitive Services requires an account in Azure, to authenticate to the various APIs and more.

How to create a Free Account

There are two different methods to create a free Azure Speech Services Account: the first creates a Trial Account that works for 7 days, the second creates a Free “Permanent” Account in your existing Azure Subscription. Let’s see how:

1. Microsoft Free Trial Account (7-days)

To get started using the Text-to-Speech REST API for free, go to Microsoft’s Try Cognitive Services page, select Speech APIs, and then click Get API Key in the Speech Services row.

Select “Guest 7-day trial”

Accept the Agreement

Choose the account you want to use (in this example I use a GitHub Account)

You will get a welcome message like this one

And here we are: this is the information we need to use the TTS Service and my App, so take note of it.
Note that the free trial endpoint is always this one below.
Endpoint: https://westus.api.cognitive.microsoft.com/sts/v1.0
Key 1: <key1>
Key 2: <key2>
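
As a rough sketch of how these values are used: Microsoft's REST documentation describes exchanging the key for a short-lived access token by sending a POST to the issueToken path under the endpoint above. The helper name New-TtsTokenRequest is hypothetical and <key1> is a placeholder; the App's own code may do this differently.

```powershell
# Build the parameters for the token request (hypothetical helper, not part of the App).
function New-TtsTokenRequest {
    param(
        [Parameter(Mandatory)] [string] $Endpoint,  # e.g. https://westus.api.cognitive.microsoft.com/sts/v1.0
        [Parameter(Mandatory)] [string] $Key        # Key 1 or Key 2 of the account
    )

    @{
        # The token is issued by the issueToken path under the endpoint.
        Uri     = $Endpoint.TrimEnd('/') + '/issueToken'
        Method  = 'Post'
        # The key travels in this header; the request needs no body.
        Headers = @{ 'Ocp-Apim-Subscription-Key' = $Key }
    }
}

# Usage (needs a valid key and network access):
# $request = New-TtsTokenRequest -Endpoint 'https://westus.api.cognitive.microsoft.com/sts/v1.0' -Key '<key1>'
# $token   = Invoke-RestMethod @request
```

The returned token is then sent as a Bearer token on the Text-to-Speech calls. Either of the two keys can be used.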

2. Azure Free Account (Permanent)

This second method requires an existing Azure Subscription.
To create the Free Speech Service Account you will need PowerShell and the Az.CognitiveServices PowerShell module.

  1. Open PowerShell and run Install-Module Az to install the Az PowerShell modules, which include Az.CognitiveServices, from the PowerShell Gallery
  2. Run Connect-AzAccount to authenticate to Azure
  3. Run Get-AzCognitiveServicesAccount to check whether any Azure Cognitive Services accounts already exist in your Azure Subscription. We assume that this command will not return anything.
  4. Run Get-AzCognitiveServicesAccountSkus -Type SpeechServices -Location <Location> to list the account types (SKUs) available in your Location
    For example, in Europe I’ll run
    Get-AzCognitiveServicesAccountSkus -Type SpeechServices -Location WestEurope
    Check that the list includes an entry like this one (it is the one we need)
    Name: F0
    Tier: Free
  5. Run this Command to create the new Account
    New-AzCognitiveServicesAccount -ResourceGroupName <ResourceGroupName> -Name <Unique Account Name> -Type SpeechServices -SkuName F0 -Location <Location>
    For example
    New-AzCognitiveServicesAccount -ResourceGroupName CognitiveServices-RG -Name LucaVitaliCognitiveServicesAccount -Type SpeechServices -SkuName F0 -Location WestEurope
  6. Run this Command to get the two keys
    Get-AzCognitiveServicesAccountKey -ResourceGroupName <ResourceGroupName> -Name <Unique Account Name>
    For example:
    Get-AzCognitiveServicesAccountKey -ResourceGroupName CognitiveServices-RG -Name LucaVitaliCognitiveServicesAccount
    You will get the two Keys to be used in the App
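
The create and get-keys steps above can be collected into one small function, so the Resource Group and Account name are typed only once. This is only a sketch: New-FreeSpeechAccount is a hypothetical name, and it assumes the Az module is installed, Connect-AzAccount has already been run, and the Resource Group already exists.

```powershell
# Hypothetical wrapper around the two cmdlets used in the steps above.
function New-FreeSpeechAccount {
    param(
        [Parameter(Mandatory)] [string] $ResourceGroupName,
        [Parameter(Mandatory)] [string] $Name,      # must be a unique account name
        [Parameter(Mandatory)] [string] $Location   # e.g. WestEurope
    )

    # Shared by both cmdlets, so the two calls always target the same account.
    $account = @{ ResourceGroupName = $ResourceGroupName; Name = $Name }

    # Create the Speech Services account on the Free (F0) tier.
    New-AzCognitiveServicesAccount @account -Type SpeechServices -SkuName F0 -Location $Location | Out-Null

    # Return the object holding the two keys (Key1 and Key2).
    Get-AzCognitiveServicesAccountKey @account
}

# Usage (names taken from the example above):
# $keys = New-FreeSpeechAccount -ResourceGroupName CognitiveServices-RG -Name LucaVitaliCognitiveServicesAccount -Location WestEurope
# $keys.Key1
```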

Even though this is a Free Account, the limits are very high for our purpose.

How to use the App

First of all, you can download AzureTTSVoiceGeneratorGUI.ps1 from GitHub

AzureTTSVoiceGeneratorGUI.ps1
Version 1.0
GitHub Repository: https://github.com/LucaVitali/AzureTTSVoiceGeneratorGUI

Save the file in a local folder on your PC, then right-click on it and select “Run with PowerShell”

Choose the Location of your TTS Account from the drop down “Location” list.
Token Service and TTS Endpoint will change accordingly.
Remember: Temp Free Accounts are always in WestUS Location
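
To illustrate why the two addresses follow the Location: in Microsoft's REST documentation both URLs simply embed the region name. The sketch below follows that documented pattern; Get-TtsEndpoints is a hypothetical helper, and the App may build its URLs differently.

```powershell
# Derive both service URLs from a Location name (hypothetical helper, not part of the App).
function Get-TtsEndpoints {
    param([Parameter(Mandatory)] [string] $Location)   # e.g. WestEurope or WestUS

    # The region appears in lower case inside the host names.
    $region = $Location.ToLowerInvariant()

    [pscustomobject]@{
        TokenService = 'https://{0}.api.cognitive.microsoft.com/sts/v1.0/issueToken' -f $region
        TtsEndpoint  = 'https://{0}.tts.speech.microsoft.com/cognitiveservices/v1' -f $region
    }
}

# Usage:
# Get-TtsEndpoints -Location WestEurope
```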

Enter the Key for your Account that you obtained above

Use Browse to choose the Output Folder or enter it directly.

Enter the name of the Output File you want to create, for example VoiceMessage.wav

Choose the Audio Format. I suggest riff-16khz-16bit-mono-pcm if you need to use the voice message on a PSTN connection with the G.711 Codec
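
The Audio Format also decides which file extension makes sense for the Output File. As a small sketch, the hypothetical helper below guesses the extension from the format name: riff- formats are WAV files and -mp3 formats are MP3 files. The .raw fallback for every other format is my own assumption, not something the service defines.

```powershell
# Pick a file extension that matches the chosen Audio Format (hypothetical helper).
function Get-AudioFileExtension {
    param([Parameter(Mandatory)] [string] $OutputFormat)

    switch -Wildcard ($OutputFormat) {
        'riff-*' { '.wav'; break }   # RIFF container, e.g. riff-16khz-16bit-mono-pcm
        '*-mp3'  { '.mp3'; break }   # e.g. audio-16khz-32kbitrate-mono-mp3
        default  { '.raw' }          # assumption: anything else is treated as headerless audio
    }
}

# Usage:
# 'VoiceMessage' + (Get-AudioFileExtension -OutputFormat 'riff-16khz-16bit-mono-pcm')
```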

The Voice list is dynamically created from the Location you chose above (not every Location has all the Voices).
Prefer Neural Voices instead of Standard Voices.
As Microsoft Docs say, “Neural voices use deep neural networks to overcome the limits of traditional text-to-speech systems in matching the patterns of stress and intonation in spoken language, and in synthesizing the units of speech into a computer voice”.
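
A minimal sketch of how such a list can be obtained and sorted: the Text-to-Speech REST API exposes a voices/list path per region, which returns one entry per voice. Both function names below are hypothetical, and spotting Neural voices by a ShortName ending in "Neural" is an assumption based on the usual naming (for example en-US-JessaNeural), not necessarily what the App does.

```powershell
# Ask the service which voices a Location offers (needs a valid access token and network access).
function Get-TtsVoiceList {
    param(
        [Parameter(Mandatory)] [string] $Location,
        [Parameter(Mandatory)] [string] $Token
    )
    $uri = 'https://{0}.tts.speech.microsoft.com/cognitiveservices/voices/list' -f $Location.ToLowerInvariant()
    Invoke-RestMethod -Method Get -Uri $uri -Headers @{ Authorization = "Bearer $Token" }
}

# Put Neural voices first, then sort alphabetically by ShortName.
function Select-NeuralFirst {
    param([Parameter(Mandatory)] [object[]] $Voices)

    # $false sorts before $true, so voices whose name ends in "Neural" come first.
    $Voices | Sort-Object -Property @{ Expression = { $_.ShortName -notlike '*Neural' } }, ShortName
}

# Usage:
# $voices = Get-TtsVoiceList -Location WestEurope -Token $token
# Select-NeuralFirst -Voices $voices | Select-Object ShortName, Locale, Gender
```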

Enter the message to be converted in the text box and hit RUN!
After a few seconds your Voice Message will be created.
Check the PowerShell window for any errors.
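
For the curious, here is a minimal sketch of what a RUN like this involves on the REST side, following Microsoft's documentation rather than the App's actual code: the text is wrapped in SSML and posted to the Text-to-Speech endpoint together with the access token and the Audio Format. The function names and the User-Agent value are hypothetical.

```powershell
# Wrap the message in SSML, escaping characters such as & and < that would break the XML.
function New-SsmlMessage {
    param(
        [Parameter(Mandatory)] [string] $Text,
        [Parameter(Mandatory)] [string] $VoiceName,
        [string] $Locale = 'en-US'
    )
    $escaped = [System.Security.SecurityElement]::Escape($Text)
    "<speak version='1.0' xml:lang='$Locale'><voice xml:lang='$Locale' name='$VoiceName'>$escaped</voice></speak>"
}

# Post the SSML and save the returned audio to a file (needs a valid token and network access).
function Invoke-TtsSynthesis {
    param(
        [Parameter(Mandatory)] [string] $Location,
        [Parameter(Mandatory)] [string] $Token,
        [Parameter(Mandatory)] [string] $Ssml,
        [Parameter(Mandatory)] [string] $OutFile,
        [string] $OutputFormat = 'riff-16khz-16bit-mono-pcm'
    )
    $uri = 'https://{0}.tts.speech.microsoft.com/cognitiveservices/v1' -f $Location.ToLowerInvariant()
    $headers = @{
        'Authorization'            = "Bearer $Token"
        'X-Microsoft-OutputFormat' = $OutputFormat
    }
    # Send the body as UTF-8 bytes so accented characters survive on every PowerShell version.
    $body = [System.Text.Encoding]::UTF8.GetBytes($Ssml)
    Invoke-RestMethod -Method Post -Uri $uri -Headers $headers -ContentType 'application/ssml+xml' `
        -UserAgent 'AzureTTSVoiceGeneratorGUI-example' -Body $body -OutFile $OutFile
}

# Usage:
# $ssml = New-SsmlMessage -Text 'Welcome, please hold.' -VoiceName 'en-US-JessaNeural'
# Invoke-TtsSynthesis -Location WestEurope -Token $token -Ssml $ssml -OutFile 'C:\Temp\VoiceMessage.wav'
```

If the call fails, Invoke-RestMethod writes the error to the PowerShell window, which is where a wrong key or an unsupported voice will show up.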

Click Save Settings to export the data above (Location, Key, Output and Audio) to an .xml file, which will be read automatically the next time you run the App. Very useful!
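
One simple way to get this kind of behaviour in PowerShell is Export-Clixml and Import-Clixml, which write an object to an .xml file and read it back. The sketch below only illustrates the idea: the function names and property names are hypothetical, and the App's own file layout may be different. Note that a Key saved this way is stored in plain text.

```powershell
# Write the settings to an .xml file (hypothetical helper, not the App's own code).
function Save-TtsSettings {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [string] $Location,
        [string] $Key,
        [string] $OutputFolder,
        [string] $AudioFormat
    )
    [pscustomobject]@{
        Location     = $Location
        Key          = $Key
        OutputFolder = $OutputFolder
        AudioFormat  = $AudioFormat
    } | Export-Clixml -Path $Path
}

# Read the settings back, returning nothing when the file does not exist yet.
function Read-TtsSettings {
    param([Parameter(Mandatory)] [string] $Path)
    if (Test-Path -Path $Path) {
        Import-Clixml -Path $Path
    }
}

# Usage:
# Save-TtsSettings -Path .\Settings.xml -Location WestEurope -Key '<key1>' -OutputFolder C:\Temp -AudioFormat riff-16khz-16bit-mono-pcm
# (Read-TtsSettings -Path .\Settings.xml).Location
```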

As always, I hope this could be helpful to some of you!
Best Regards.
Luca
