Whisper audio transcription powered by AWS Batch and AWS Inferentia

HPC Blog

This article describes a cost-efficient solution for batch audio transcription on AWS using the Whisper model from OpenAI. It uses AWS Batch and AWS Inferentia to process audio files asynchronously, with processing triggered when audio files arrive in Amazon S3.

Specifically, the article covers:

Overview of AWS Batch and AWS Inferentia
Architecture of the event-driven audio transcription pipeline
Prerequisites and steps to build the Docker image, export Whisper model artifacts, and deploy the solution using AWS CloudFormation
Details on the inference script to process audio files using Whisper on Inferentia
Testing and validation of the solution using NASA audio files
Monitoring and metrics for the AWS Batch compute environment and Inferentia utilization
Conclusion highlighting the cost-optimization benefits of this batch processing solution
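
To make the event-driven flow above more concrete, here is a minimal sketch of how an S3 upload event might be turned into an AWS Batch job submission. It is not taken from the article: the queue name, job definition name, environment variable names, and the choice of a Lambda-style handler are all assumptions made for illustration, and the article's actual trigger mechanism may differ. The only AWS call used is the boto3 Batch client's `submit_job`.

```python
import re
from urllib.parse import unquote_plus

# Hypothetical resource names; the summarized article does not specify them.
JOB_QUEUE = "whisper-transcription-queue"
JOB_DEFINITION = "whisper-inferentia-job"


def build_submit_job_requests(event, job_queue=JOB_QUEUE, job_definition=JOB_DEFINITION):
    """Turn an S3 event notification into keyword arguments for Batch submit_job."""
    requests = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        # Batch job names allow letters, digits, hyphens, and underscores (max 128 chars).
        job_name = re.sub(r"[^A-Za-z0-9_-]", "-", f"transcribe-{key}")[:128]
        requests.append({
            "jobName": job_name,
            "jobQueue": job_queue,
            "jobDefinition": job_definition,
            "containerOverrides": {
                # Assumed convention: the container reads its input location
                # from these (hypothetical) environment variables.
                "environment": [
                    {"name": "S3_BUCKET", "value": bucket},
                    {"name": "S3_KEY", "value": key},
                ]
            },
        })
    return requests


def handler(event, context=None, batch_client=None):
    """Lambda-style entry point: submit one Batch job per uploaded object."""
    if batch_client is None:
        import boto3  # imported lazily so the request builder works without AWS access
        batch_client = boto3.client("batch")
    return [
        batch_client.submit_job(**request)["jobId"]
        for request in build_submit_job_requests(event)
    ]
```

Keeping the request builder separate from the submission call means the mapping from S3 object to job parameters can be checked locally, without AWS credentials, before wiring it to a real queue.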


May 28, 2025

Enhanced Performance for Whisper Audio Transcription on AWS Batch and AWS Inferentia

Sep 30, 2024

Reducing transcription costs by 60% using AWS AI/ML services

Nov 6, 2024

Unearth insights from audio transcripts generated by Amazon Transcribe using Amazon Bedrock

Apr 22, 2026

Cost-effective multilingual audio transcription at scale with Parakeet-TDT and AWS Batch


Related articles