Host the Whisper Model on Amazon SageMaker: exploring inference options

Machine Learning Blog



This article explains how to deploy the open-source Whisper automatic speech recognition (ASR) model on Amazon SageMaker using either the PyTorch or the Hugging Face framework. It also explores three inference options available on SageMaker: real-time inference, batch transform jobs, and asynchronous inference.

Specifically, the article covers:

  • Saving model artifacts for PyTorch and Hugging Face implementations of Whisper
  • Selecting appropriate deep learning containers (DLCs) for each framework
  • Creating SageMaker models for PyTorch and Hugging Face Whisper
  • Defining custom inference scripts for loading the models
  • Deploying the models using real-time inference, batch transform jobs, and asynchronous inference
  • Comparing the inference options in terms of speed, payload size, scalability, and cost
  • Providing guidance on choosing the appropriate inference option based on use case requirements
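
To make the last point concrete, here is a minimal sketch of how a choice between the three inference options might be encoded. It is not taken from the article: the `Workload` and `Limits` types, the `choose_inference_option` function, and the decision order are all hypothetical, and the payload limits are deliberately left as caller-supplied values rather than hard-coded, since the actual SageMaker quotas should be checked in the current documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of one transcription workload."""
    payload_mb: float               # size of a single audio request
    needs_immediate_response: bool  # caller blocks while waiting for the transcript
    is_offline_dataset: bool        # many files already sitting in storage


@dataclass(frozen=True)
class Limits:
    """Caller-supplied payload limits (assumed values, not official quotas)."""
    realtime_max_payload_mb: float
    async_max_payload_mb: float


def choose_inference_option(workload: Workload, limits: Limits) -> str:
    """Return 'real-time', 'asynchronous', or 'batch-transform'."""
    # An offline dataset with no waiting caller suits a batch transform job.
    if workload.is_offline_dataset and not workload.needs_immediate_response:
        return "batch-transform"
    # Small payloads with a waiting caller fit a real-time endpoint.
    if (workload.needs_immediate_response
            and workload.payload_mb <= limits.realtime_max_payload_mb):
        return "real-time"
    # Everything else that fits is queued on an asynchronous endpoint,
    # including interactive requests too large for real-time inference.
    if workload.payload_mb <= limits.async_max_payload_mb:
        return "asynchronous"
    raise ValueError("payload exceeds every configured limit; split the audio")


if __name__ == "__main__":
    # Illustrative limits only; substitute the quotas that apply to your account.
    example_limits = Limits(realtime_max_payload_mb=6, async_max_payload_mb=1024)
    short_clip = Workload(payload_mb=2, needs_immediate_response=True,
                          is_offline_dataset=False)
    print(choose_inference_option(short_clip, example_limits))
```

Keeping the limits in a separate value object means the same rule can be re-evaluated when quotas or cost trade-offs change, without editing the decision logic itself.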

