Amazon SageMaker AI Inference now supports bidirectional streaming
This article announces bidirectional streaming support in Amazon SageMaker AI Inference for real-time speech-to-text transcription.
- Enables continuous speech processing with minimal latency instead of batch input
- Models receive audio streams and return partial transcripts simultaneously as users speak
- Eliminates need for custom WebSocket implementations and streaming protocol management
- Clients connect over HTTP/2, and SageMaker automatically creates the WebSocket connection to the model container
- Compatible with existing speech models, such as Deepgram's, without modification
- Available across 32+ AWS regions globally
SageMaker AI Inference's bidirectional streaming simplifies voice agent development by providing managed infrastructure for real-time transcription, so teams can use it immediately instead of spending months building their own streaming stack.
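The send-while-receiving pattern described above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions: it does not call SageMaker or any AWS SDK, the two `asyncio.Queue` objects stand in for the two directions of the managed connection, and `send_audio`, `mock_model`, `receive_transcripts`, and `transcribe` are hypothetical names invented for this example. Each "audio chunk" is a byte string that the mock model treats as one already-recognized word.

```python
import asyncio


async def send_audio(chunks, uplink):
    """Client side: push audio chunks upstream as they become available."""
    for chunk in chunks:
        await uplink.put(chunk)
        await asyncio.sleep(0)  # yield so the model can respond between chunks
    await uplink.put(None)  # sentinel: end of the audio stream


async def mock_model(uplink, downlink):
    """Hypothetical stand-in for a speech model behind the endpoint.

    Emits a growing partial transcript after every chunk it receives,
    rather than waiting for the whole input as a batch call would.
    """
    words = []
    while True:
        chunk = await uplink.get()
        if chunk is None:
            break
        words.append(chunk.decode("utf-8"))
        await downlink.put(" ".join(words))
    await downlink.put(None)  # sentinel: no more transcripts


async def receive_transcripts(downlink):
    """Client side: collect partial transcripts while audio is still being sent."""
    partials = []
    while True:
        partial = await downlink.get()
        if partial is None:
            break
        partials.append(partial)
    return partials


async def run_session(chunks):
    # Two queues model the two directions of one bidirectional stream.
    uplink = asyncio.Queue()
    downlink = asyncio.Queue()
    _, _, partials = await asyncio.gather(
        send_audio(chunks, uplink),
        mock_model(uplink, downlink),
        receive_transcripts(downlink),
    )
    return partials


def transcribe(chunks):
    """Run one simulated streaming session and return every partial transcript."""
    return asyncio.run(run_session(chunks))


if __name__ == "__main__":
    for partial in transcribe([b"turn", b"on", b"the", b"lights"]):
        print(partial)
```

The sender and receiver run concurrently, so partial results arrive before the input has finished. In a real deployment the queues would be replaced by the managed streaming connection, and the model would be the deployed container.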
Related articles
- Nov 25, 2025: Introducing bidirectional streaming for real-time inference on Amazon SageMaker AI
- Dec 6, 2024: Amazon SageMaker introduces new capabilities to accelerate scaling of Generative AI Inference
- Apr 22, 2026: Amazon SageMaker AI now supports optimized generative AI inference recommendations
- May 4, 2026: Amazon SageMaker AI Now Supports Capacity-Aware Inference with Automatic Instance Fallback