Datatalk / TTS.
This is a step-by-step guide to setting up Amazon Polly on Windows.
Unfortunately, it doesn't solve my real problem: I am looking for a way to use Polly voices with a TTS reader, such as Balabolka. There are some problems...
Well, at least I managed to get Amazon running, in spite of their poor online instructions.
0. Warnings
1. You need a credit card
2. The service isn't free
3. Amazon's instructions never match their screens, so pay attention
4. You need admin rights on your machine
1. Set up an Amazon account.
This page explains how to use Amazon Polly in Windows Applications so let's try...
(Note that this process keeps changing all the time, so pay attention to your screen and the instructions!)
1.1 Create an AWS account using this link.
1.2 Enter an email address and user name (you can use your email address again, or use something else)
1.3 Amazon will send you a verification email to your email address, containing a validation code.
1.4 Enter the validation code.
1.5 Create a 'root user' password. The root user is the highest level of user and handles billing etc.
1.6 Give them lots of your data
1.7 A credit card is required!
1.8 The system will ask you to pay $1 for verification
1.9 Answer a captcha code
1.10 It will then call you on a phone, or send you a text message
1.11 Pick the support plan (free)
You now have created an AWS account.
2. Create an 'IAM' user
An 'IAM user' is a person or a machine (software client) that can make API calls to AWS products. We're going to install the Amazon voice in Windows later.
2.1 Go to aws.amazon.com and sign in as the root user
The captcha is really horrible and hardly legible. I know Amazon is trying to keep out bots, but this is a little too much.
2.2 Type in the search bar 'IAM' (there doesn't seem to be an icon to click, weirdly enough)
2.3 When you go into the IAM dashboard the system will ask you to add MFA for the root user. Do so if you want to.
After adding an MFA go up one level and do a refresh, otherwise it will still show you as not having an active MFA.
2.4 Go back to the top level of IAM (the 'IAM Dashboard' screen)
2.5 Users / Add User
2.6 User name: polly-windows-user
I copied this from this page, I'm not sure if you can use any other name, though I suspect you can.
Now the screen doesn't match the instructions, so I'm NOT going to provide the user access to the console. There is no option 'Programmatic access' so I simply click...
2.7 Next
2.8 On the 'Set Permissions' screen click on 'Attach Policies Directly'
2.9 In the search box enter 'polly'
2.10 AmazonPollyReadOnlyAccess, then 'Next'
2.11 A 'Review and Create' screen comes up. Select 'Create User'
2.12 The 'IAM > Users' screen pops up.
Again, everything looks different than instructed. I think :-) you have to continue like this...
2.13 Click on 'polly-windows-user'
2.14 Then 'Security Credentials'
2.15 Then 'Create Access Key'
Again, things are a little unclear. I think it's either 'Application Running Outside' or 'Other'. I picked...
2.16 'Other', Next
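Access keys are generated. Copy both the Access Key and the Secret Access Key now (or download the .csv file), because the secret key is only shown once and you need both in step 4.2. If you lose them you can always delete the key and create a new one.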
3. Install the AWS CLI client
3.1 Install the client from here.
3.2 Download and install the 64 bit .MSI installer for Windows
To check if it works properly:
3.3 Open CMD, then enter:
c:\> aws --version
It should report something like 'aws-cli/1.27.98 Python/3.8.10 Windows/10 botocore/1.29.98'
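If you want to do the same check from a script, here is a minimal Python sketch. The function names are my own invention (hypothetical), and the only thing it assumes about the CLI is that `aws --version` prints a banner like the one above.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple


def parse_aws_version(banner: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from a banner such as
    'aws-cli/1.27.98 Python/3.8.10 Windows/10 botocore/1.29.98'."""
    match = re.search(r"aws-cli/(\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError(f"not an AWS CLI version banner: {banner!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def installed_aws_version() -> Optional[Tuple[int, int, int]]:
    """Return the installed CLI version, or None if 'aws' is not on the PATH."""
    exe = shutil.which("aws")
    if exe is None:
        return None
    result = subprocess.run([exe, "--version"], capture_output=True, text=True, check=True)
    # To be safe, look at both output streams for the banner.
    return parse_aws_version(result.stdout + result.stderr)


if __name__ == "__main__":
    print(installed_aws_version())
```

If it prints `None`, the installer did not put `aws` on the PATH (or you need to open a new CMD window).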
4. Create a profile for the AWS CLI client
4.1 CMD, then
c:\> aws configure --profile polly-windows
4.2 Enter the Access and the Secret Access Key
4.3 Enter the region, I used 'eu-central-1'
You can find the region list on this page.
4.4 Default output format: just hit Enter
The configuration is done now.
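As far as I know, `aws configure` stores the keys in `.aws\credentials` and the region in `.aws\config`, both inside your user folder, and a named profile ends up in the config file under a `[profile polly-windows]` heading. A minimal Python sketch to read the region back, assuming that layout (the function name is hypothetical):

```python
import configparser
from pathlib import Path
from typing import Optional


def profile_region(config_text: str, profile: str) -> Optional[str]:
    """Return the region stored for `profile` in the text of an AWS CLI config file."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # Assumption: named profiles are stored as "[profile NAME]", the default as "[default]".
    section = "default" if profile == "default" else f"profile {profile}"
    if not parser.has_section(section):
        return None
    return parser.get(section, "region", fallback=None)


if __name__ == "__main__":
    config_path = Path.home() / ".aws" / "config"
    if config_path.exists():
        print(profile_region(config_path.read_text(), "polly-windows"))
    else:
        print("no config file found at", config_path)
```

If this shows the region you entered in step 4.3, the profile was written correctly.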
4.5 To check if it works and to see what voices are available, enter:
c:\> aws --profile polly-windows polly describe-voices
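The list is long, so it helps to filter it. Here is a minimal Python sketch that picks the voices for one engine and language out of the JSON that `describe-voices` prints. The function name is hypothetical and the sample data is hand-written and shortened; it only imitates the field names (`Voices`, `Id`, `LanguageCode`, `SupportedEngines`) of the real output.

```python
import json
from typing import List


def voices_for(describe_voices_json: str, engine: str, language_code: str) -> List[str]:
    """Return the sorted voice IDs that support `engine` and speak `language_code`."""
    data = json.loads(describe_voices_json)
    return sorted(
        voice["Id"]
        for voice in data["Voices"]
        if engine in voice.get("SupportedEngines", [])
        and voice["LanguageCode"] == language_code
    )


# Hand-written, shortened sample in the shape of the CLI output (not real output).
SAMPLE = """
{"Voices": [
  {"Id": "Ruth",   "LanguageCode": "en-US", "SupportedEngines": ["neural"]},
  {"Id": "Joanna", "LanguageCode": "en-US", "SupportedEngines": ["neural", "standard"]},
  {"Id": "Vicki",  "LanguageCode": "de-DE", "SupportedEngines": ["neural", "standard"]}
]}
"""

if __name__ == "__main__":
    print(voices_for(SAMPLE, "neural", "en-US"))
```

To use it on the real thing, redirect the output of step 4.5 to a file and pass the file's contents to `voices_for`.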
5. Windows Plugin
5.1 Download the Windows Plugin (from here or here)
5.2 Install the thing
5.3 Agree to everything
Now comes the hard part... finding a reader that works properly...
6. Command line
Only 64 bit applications can see 64 bit voices, and you guessed it, Amazon is a 64 bit voice, and my favorite app is 32 bit (Balabolka). In other words: no luck.
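To see the split for yourself, here is a minimal Python sketch that lists the classic SAPI voices as a 64 bit program and as a 32 bit program would see them. It assumes the voices are registered under the usual SAPI registry key (`SOFTWARE\Microsoft\Speech\Voices\Tokens`); I have not verified that the Amazon plugin registers itself there, so treat the result as a hint, not as proof. The function name is hypothetical.

```python
import sys
from typing import List

# Assumption: classic SAPI 5 voices are registered under this key in HKEY_LOCAL_MACHINE.
SAPI_TOKENS = r"SOFTWARE\Microsoft\Speech\Voices\Tokens"


def list_sapi_voices(want_64bit: bool) -> List[str]:
    """Return the voice token names in the 64 bit or the 32 bit registry view."""
    if sys.platform != "win32":
        return []  # there is no registry outside Windows
    import winreg

    view = winreg.KEY_WOW64_64KEY if want_64bit else winreg.KEY_WOW64_32KEY
    try:
        key = winreg.OpenKey(
            winreg.HKEY_LOCAL_MACHINE, SAPI_TOKENS, 0, winreg.KEY_READ | view
        )
    except OSError:
        return []  # key not present in this view
    with key:
        subkey_count = winreg.QueryInfoKey(key)[0]
        return [winreg.EnumKey(key, index) for index in range(subkey_count)]


if __name__ == "__main__":
    print("64 bit view:", list_sapi_voices(True))
    print("32 bit view:", list_sapi_voices(False))
```

A voice that only shows up in the 64 bit list is invisible to a 32 bit reader.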
But... did it work? Of course it did :-)
Simple text phrases
6.1 From the command line I test the workings of AWS with this:
c:\> aws --profile polly-windows polly synthesize-speech ^
  --output-format mp3 ^
  --engine neural ^
  --voice-id Ruth ^
  --text "'Well,' she said, 'that is quite the surprise. Now tell me, how do we continue from here?'" ^
  hello.mp3
6.2 Then played it back using:
c:\> start hello.mp3
The created file is 24 kHz, 16 bit.
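Quoting text with apostrophes and quotes on the CMD line gets tedious quickly, so here is a minimal Python sketch that does the same call. It passes the arguments as a list, which means the text needs no extra escaping. The function names are hypothetical; the profile name is the one from step 4, and the flags are the same ones used in 6.1.

```python
import shutil
import subprocess
from typing import List


def synthesize_command(
    text: str,
    outfile: str,
    voice: str = "Ruth",
    engine: str = "neural",
    profile: str = "polly-windows",
) -> List[str]:
    """Build the argument list for 'aws polly synthesize-speech'."""
    return [
        "aws", "--profile", profile,
        "polly", "synthesize-speech",
        "--output-format", "mp3",
        "--engine", engine,
        "--voice-id", voice,
        "--text", text,
        outfile,
    ]


def synthesize(text: str, outfile: str) -> None:
    """Run the CLI; raises CalledProcessError if AWS reports an error."""
    command = synthesize_command(text, outfile)
    # Resolve the full path first, so an 'aws.cmd' wrapper is found as well.
    command[0] = shutil.which("aws") or "aws"
    subprocess.run(command, check=True)


if __name__ == "__main__":
    synthesize("'Well,' she said, 'that is quite the surprise.'", "hello.mp3")
```

Running the script costs a few characters of your Polly quota, just like the command in 6.1.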
Mission accomplished! Except that it doesn't work with Balabolka...
7. Goal and issues
So, I'm looking for some kind of TTS program that could use Amazon for short sequences. Balabolka works fine, and can do Amazon Polly text to audio on full files, but I'd like to be able to use Amazon Polly when copying text to the clipboard, and I haven't found a decent solution for that yet.
Here are the problems:
1. There are 32 bit and 64 bit voices / (S)API interfaces (and I'm running 32 bit Balabolka on a 64 bit PC)
2. The new Windows 11 voices use a completely different system, yet again
3. Amazon Polly and Google WaveNet are different, yet again...
Theoretically, it should be possible to replace the default voice in Windows with something else, but I can't find a way in Windows 11 (and probably not on Windows 10 either).
So, still a challenge...
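One way around the whole SAPI business is to skip the voice plugin and send the clipboard straight to the command line client. Below is a minimal Python sketch of that idea, not a finished tool. Assumptions: Polly limits the amount of text per request (I use 3000 characters as a cautious chunk size, check the current limits), the profile from step 4 exists, and `tkinter` is available for reading the clipboard. All function and file names are hypothetical.

```python
import shutil
import subprocess
from typing import List

MAX_CHARS = 3000  # assumption: a safe chunk size below Polly's per-request text limit


def split_text(text: str, limit: int = MAX_CHARS) -> List[str]:
    """Split text into chunks of at most `limit` characters, breaking at
    whitespace where possible. Runs of whitespace collapse to single spaces."""
    chunks: List[str] = []
    current = ""
    for word in text.split():
        # A single word longer than the limit is cut into hard pieces.
        while len(word) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:limit])
            word = word[limit:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def clipboard_text() -> str:
    """Return the clipboard text, or '' if the clipboard holds no text."""
    import tkinter

    root = tkinter.Tk()
    root.withdraw()  # no window needed, only clipboard access
    try:
        return root.clipboard_get()
    except tkinter.TclError:
        return ""
    finally:
        root.destroy()


def speak_clipboard(voice: str = "Ruth", profile: str = "polly-windows") -> List[str]:
    """Synthesize the clipboard into clip000.mp3, clip001.mp3, ... and return the names."""
    aws = shutil.which("aws") or "aws"
    outfiles = []
    for index, chunk in enumerate(split_text(clipboard_text())):
        outfile = f"clip{index:03d}.mp3"
        subprocess.run(
            [aws, "--profile", profile, "polly", "synthesize-speech",
             "--output-format", "mp3", "--engine", "neural",
             "--voice-id", voice, "--text", chunk, outfile],
            check=True,
        )
        outfiles.append(outfile)
    return outfiles


if __name__ == "__main__":
    print(speak_clipboard())
```

Playback is left out on purpose; the resulting files can be started the same way as in 6.2.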
Will be continued!
Update. I did create a kind of interim solution using Immprep...