Build a Speech Summary Generator with Amazon Polly
Imagine turning your meeting notes into lifelike audio summaries that your team can listen to on the go. In this hands-on lab, you’ll learn how to bring that experience to life using Amazon Polly. You’ll start by preparing structured insights and setting up Amazon S3 to store your output files. Then, using the Polly console, you’ll generate natural-sounding speech by selecting voices, applying SSML for expressive delivery, and optionally using lexicons to fine-tune pronunciation. Finally, you’ll retrieve and listen to your synthesized audio file from S3. Whether you're building voice-enabled apps, audio briefings, or accessibility tools, this lab will give you the skills to create polished speech experiences with ease.
-
Challenge
Prepare Structured Meeting Insights
You'll organize your output folders in Amazon S3, ensuring each synthesis job has a dedicated location for its audio output.
- Within the polly-speech-generator-bucket S3 bucket, create an output folder.
- Inside the output folder, create the following subfolders:
  - plain_text
  - ssml
  - lexicon
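The lab creates these folders in the S3 console. As a rough programmatic sketch of the same layout, the snippet below assumes a boto3-style S3 client is passed in; the helper names `folder_keys` and `create_folders` are hypothetical and not part of the lab. It relies on the fact that an S3 "folder" is a zero-byte object whose key ends in a slash.

```python
def folder_keys(base="output", subfolders=("plain_text", "ssml", "lexicon")):
    """Return the S3 keys for the base folder and each of its subfolders."""
    base = base.strip("/")
    return [f"{base}/"] + [f"{base}/{name}/" for name in subfolders]


def create_folders(s3_client, bucket, keys):
    """Create one zero-byte placeholder object per key.

    s3_client is assumed to expose put_object(Bucket=..., Key=...),
    as a boto3 S3 client does.
    """
    for key in keys:
        s3_client.put_object(Bucket=bucket, Key=key)


# Possible usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   create_folders(boto3.client("s3"), "polly-speech-generator-bucket", folder_keys())
```

Passing the client in as an argument keeps the key-building logic testable without touching AWS.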
-
Challenge
Convert Text into Realistic Speech
Learn how to customize voice selection and language options, and enhance expressiveness using Speech Synthesis Markup Language (SSML) and lexicons for pronunciation control. Create three synthesis jobs and configure them to output the MP3 files to the respective folders created in the previous objective:
- Plain text synthesis job → S3 key prefix: output/plain_text/
- Lexicon synthesis job → S3 key prefix: output/lexicon/
- SSML synthesis job → S3 key prefix: output/ssml/
Nice to have: For the SSML synthesis job, try adding your own SSML tags to the starter file before synthesizing. Experiment with <break> and <prosody> tags to control pacing and emphasis.
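The lab configures these jobs in the Polly console. For readers who prefer code, here is a minimal sketch of the equivalent request parameters for Polly's `start_speech_synthesis_task` operation. The sample text, the voice `Joanna`, the lexicon name `meetingterms`, and the helper `build_task` are all assumptions for illustration; the lab supplies its own starter files and lexicon.

```python
import xml.etree.ElementTree as ET

BUCKET = "polly-speech-generator-bucket"

# Hypothetical sample inputs, not the lab's starter files.
PLAIN_TEXT = "Team sync summary. The release moves to Friday."
SSML_TEXT = (
    "<speak>Team sync summary."
    '<break time="500ms"/>'
    '<prosody rate="slow">The release moves to Friday.</prosody>'
    "</speak>"
)


def build_task(text, key_prefix, text_type="text", lexicon_names=None):
    """Return keyword arguments for start_speech_synthesis_task."""
    if text_type == "ssml":
        root = ET.fromstring(text)  # raises ParseError on malformed markup
        if root.tag != "speak":
            raise ValueError("SSML must be wrapped in a <speak> element")
    params = {
        "OutputFormat": "mp3",
        "OutputS3BucketName": BUCKET,
        "OutputS3KeyPrefix": key_prefix,
        "Text": text,
        "TextType": text_type,
        "VoiceId": "Joanna",  # assumed voice; pick any voice you like
    }
    if lexicon_names:
        params["LexiconNames"] = list(lexicon_names)
    return params


TASKS = [
    build_task(PLAIN_TEXT, "output/plain_text/"),
    build_task(PLAIN_TEXT, "output/lexicon/", lexicon_names=["meetingterms"]),
    build_task(SSML_TEXT, "output/ssml/", text_type="ssml"),
]

# Possible submission (assumes boto3, credentials, and an uploaded lexicon):
#   import boto3
#   polly = boto3.client("polly")
#   for task in TASKS:
#       polly.start_speech_synthesis_task(**task)
```

Parsing the SSML locally before submitting catches unbalanced tags early, though it does not check whether Polly supports a given tag for the chosen voice.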
-
Challenge
Retrieve and Review the Synthesized Speech Summary from Amazon S3
After generating the audio files, you'll retrieve and play back the synthesized speech summaries from Amazon S3. You'll compare the plain text and lexicon outputs to hear how custom pronunciation rules affect the speech, and you'll observe how SSML markup can enhance naturalness and expressiveness, even when the difference is subtle.
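Retrieval can also be scripted rather than done through the console. The sketch below assumes a boto3-style S3 client and fewer than 1,000 objects per prefix (it reads only the first page of `list_objects_v2` results); the helper names and the local `summaries` directory are hypothetical.

```python
import os

PREFIXES = ("output/plain_text/", "output/lexicon/", "output/ssml/")


def list_audio_keys(s3_client, bucket, prefix):
    """Return the .mp3 keys under prefix (first page of results only)."""
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    return [
        obj["Key"]
        for obj in response.get("Contents", [])
        if obj["Key"].endswith(".mp3")
    ]


def download_summaries(s3_client, bucket, dest_dir="summaries", prefixes=PREFIXES):
    """Download every .mp3 under each prefix, mirroring the S3 key layout."""
    downloaded = []
    for prefix in prefixes:
        for key in list_audio_keys(s3_client, bucket, prefix):
            local_path = os.path.join(dest_dir, key)
            os.makedirs(os.path.dirname(local_path), exist_ok=True)
            s3_client.download_file(bucket, key, local_path)
            downloaded.append(local_path)
    return downloaded


# Possible usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   paths = download_summaries(boto3.client("s3"), "polly-speech-generator-bucket")
```

Keeping each job's files in a matching local subfolder makes it easier to play the plain text, lexicon, and SSML versions side by side.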