Highlight text as it’s being spoken using Amazon Polly
This article demonstrates how to highlight text in real time as Amazon Polly speaks it aloud, so that text-to-speech applications can keep on-screen text in sync with the audio.
- Amazon Polly generates audio files and speech marks (JSON) that record word and sentence timing
- Speech marks contain timestamps in milliseconds for precise text synchronization
- The solution uses Lambda, API Gateway, S3, and CloudFront for a serverless architecture
- Browser-side JavaScript parses speech marks and highlights text using timeouts
- Useful for audiobooks, educational content, accessibility, and dynamic animations
- Alternative approaches include Step Functions, asynchronous processing, or presigned URLs
- Complete CloudFormation template available on GitHub for easy deployment
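
The browser-side step in the list above (parse the speech marks, then highlight each word with a timeout) can be sketched roughly as follows. This is a minimal illustration, not the article's actual code: the function names `parseSpeechMarks` and `scheduleHighlights`, the `onWord` callback, and the sample data are all hypothetical. It assumes the speech marks arrive as one JSON object per line with `time`, `type`, `start`, `end`, and `value` fields.

```typescript
// One speech mark as emitted by Amazon Polly (one JSON object per line).
interface SpeechMark {
  time: number;   // milliseconds from the start of the audio
  type: string;   // e.g. "word" or "sentence"
  start?: number; // offset of the item in the input text
  end?: number;   // offset just past the item in the input text
  value: string;  // the spoken word or sentence
}

// Parse newline-delimited JSON into an array of speech marks.
// Blank lines (such as a trailing newline) are skipped.
function parseSpeechMarks(raw: string): SpeechMark[] {
  return raw
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as SpeechMark);
}

// Schedule one timeout per word mark that has not been spoken yet.
// playbackOffsetMs is the current audio position, so the same function
// works when playback resumes mid-way. The timer handles are returned
// so the caller can cancel them on pause or seek.
function scheduleHighlights(
  marks: SpeechMark[],
  onWord: (mark: SpeechMark) => void,
  playbackOffsetMs = 0,
): ReturnType<typeof setTimeout>[] {
  return marks
    .filter((m) => m.type === "word" && m.time >= playbackOffsetMs)
    .map((m) => setTimeout(() => onWord(m), m.time - playbackOffsetMs));
}

// Hypothetical sample data, for illustration only.
const sample = [
  '{"time":6,"type":"word","start":0,"end":5,"value":"Hello"}',
  '{"time":412,"type":"word","start":6,"end":11,"value":"world"}',
].join("\n");

const marks = parseSpeechMarks(sample);
```

In a page, `scheduleHighlights` would typically be called when the audio element starts playing, passing `audio.currentTime * 1000` as the offset and an `onWord` callback that wraps the matching range of text in a highlighted element. The returned handles should be cleared with `clearTimeout` on pause. Note that Polly documents `start` and `end` as byte offsets rather than character offsets, so text containing multi-byte characters needs a conversion step before slicing.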
This solution provides a practical way to synchronize visual text highlighting with audio playback, improving comprehension and accessibility for text-to-speech applications.