Amazon SageMaker AI Inference now supports bidirectional streaming


This article announces bidirectional streaming support in Amazon SageMaker AI Inference for real-time speech-to-text transcription.

Enables continuous, low-latency speech processing instead of batch input
Models receive audio streams and return partial transcripts at the same time, as users speak
Eliminates the need for custom WebSocket implementations and streaming protocol management
Clients connect over HTTP/2, and SageMaker AI automatically creates a WebSocket connection to the container
Compatible with existing speech models such as Deepgram, without modifications
Available in 32+ AWS Regions globally

Bidirectional streaming in SageMaker AI Inference simplifies voice agent development by providing managed infrastructure for real-time transcription, cutting deployment time from months to immediate availability.
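
The send-while-receiving pattern described above can be illustrated with a minimal local sketch. It does not call SageMaker or any AWS SDK: the two asyncio queues stand in for the two directions of the stream, and `fake_model`, `send_audio`, `receive_transcripts`, and `transcribe` are hypothetical names invented for this example. The "model" simply treats each audio chunk as one word, which is an assumption made to keep the sketch self-contained.

```python
import asyncio

END = None  # sentinel marking the end of a stream (hypothetical convention)


async def send_audio(chunks, uplink):
    """Client side: push audio chunks onto the uplink as they are 'spoken'."""
    for chunk in chunks:
        await uplink.put(chunk)
        await asyncio.sleep(0)  # yield so the other coroutines can run
    await uplink.put(END)


async def fake_model(uplink, downlink):
    """Stand-in for a speech model: emits a growing partial transcript per chunk."""
    words = []
    while (chunk := await uplink.get()) is not END:
        words.append(chunk.decode())  # assumption: one chunk decodes to one word
        await downlink.put(" ".join(words))
    await downlink.put(END)


async def receive_transcripts(downlink):
    """Client side: collect partial transcripts while audio is still being sent."""
    partials = []
    while (partial := await downlink.get()) is not END:
        partials.append(partial)
    return partials


async def transcribe(chunks):
    """Run sender, model, and receiver concurrently over two queues."""
    uplink, downlink = asyncio.Queue(), asyncio.Queue()
    _, _, partials = await asyncio.gather(
        send_audio(chunks, uplink),
        fake_model(uplink, downlink),
        receive_transcripts(downlink),
    )
    return partials


def run_demo():
    return asyncio.run(transcribe([b"hello", b"from", b"sagemaker"]))


if __name__ == "__main__":
    for line in run_demo():
        print(line)
```

The point of the sketch is that the receiver does not wait for the sender to finish: each partial transcript becomes available as soon as the corresponding chunk has been processed, which is the behavior a batch request cannot provide.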



Nov 25, 2025

Introducing bidirectional streaming for real-time inference on Amazon SageMaker AI

Jun 17, 2026

Amazon SageMaker AI Async Inference now supports inline request payloads

Dec 6, 2024

Amazon SageMaker introduces new capabilities to accelerate scaling of Generative AI Inference

Apr 22, 2026

Amazon SageMaker AI now supports optimized generative AI inference recommendations



Related articles