Amazon SageMaker Serverless Inference is now generally available

Amazon has announced the general availability of Amazon SageMaker Serverless Inference across 21 AWS Regions, offering a simplified way to deploy machine learning models for inference.

Deploy ML models without configuring the underlying infrastructure
Compute capacity is provisioned, scaled, and managed automatically
Pay only for the compute capacity used, billed by the millisecond
Suited to applications with intermittent or unpredictable traffic
New at general availability: SageMaker Python SDK support and a concurrent invocation limit raised to 200
Endpoints can be created from the AWS console, the AWS SDKs, AWS CloudFormation, or the AWS CLI
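
As a rough illustration of the SDK route listed above, the following sketch builds the request for a serverless endpoint configuration and passes it to boto3. The helper names, the variant name "AllTraffic", and the default values are hypothetical. The memory sizes and the concurrency ceiling of 200 are assumptions based on the limits described at general availability, so check the current documentation before relying on them.

```python
from typing import Any, Dict

# Assumed service limits at general availability; verify against current docs.
MAX_CONCURRENCY_LIMIT = 200
ALLOWED_MEMORY_MB = (1024, 2048, 3072, 4096, 5120, 6144)


def build_serverless_endpoint_config(
    config_name: str,
    model_name: str,
    memory_mb: int = 2048,
    max_concurrency: int = 5,
) -> Dict[str, Any]:
    """Return keyword arguments for SageMaker's create_endpoint_config call."""
    if memory_mb not in ALLOWED_MEMORY_MB:
        raise ValueError(f"memory_mb must be one of {ALLOWED_MEMORY_MB}")
    if not 1 <= max_concurrency <= MAX_CONCURRENCY_LIMIT:
        raise ValueError(f"max_concurrency must be 1..{MAX_CONCURRENCY_LIMIT}")
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",  # hypothetical variant name
                "ModelName": model_name,      # an existing SageMaker model
                # ServerlessConfig replaces the instance type and count
                # that a provisioned endpoint would specify.
                "ServerlessConfig": {
                    "MemorySizeInMB": memory_mb,
                    "MaxConcurrency": max_concurrency,
                },
            }
        ],
    }


def create_serverless_endpoint(endpoint_name: str, config: Dict[str, Any]) -> None:
    """Create the endpoint config and the endpoint (needs AWS credentials)."""
    import boto3  # imported here so the builder above works without boto3

    client = boto3.client("sagemaker")
    client.create_endpoint_config(**config)
    client.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=config["EndpointConfigName"],
    )
```

Calling build_serverless_endpoint_config("demo-config", "demo-model") yields a dictionary that can be inspected or logged before any AWS call is made, which keeps the validation step separate from the network call.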

The service runs machine learning inference without any server management, which makes it a flexible and cost-effective way to deploy ML models.
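
To make the pay-per-use billing concrete, here is a minimal sketch of the arithmetic, assuming compute is charged only for the milliseconds spent serving requests. The function name and the per-second price are hypothetical placeholders, not published rates, and the sketch ignores any other charges such as data processing.

```python
from decimal import Decimal


def serverless_compute_cost(
    invocations: int,
    avg_duration_ms: int,
    price_per_second: Decimal,
) -> Decimal:
    """Estimate compute cost when billing is metered per millisecond.

    price_per_second is a caller-supplied, hypothetical rate for the
    chosen memory size; look up the real rate on the pricing page.
    """
    billed_ms = invocations * avg_duration_ms
    # Convert billed milliseconds to seconds, then apply the rate.
    return (Decimal(billed_ms) / Decimal(1000)) * price_per_second


# Example: 1,000 requests averaging 250 ms each at a made-up rate.
estimate = serverless_compute_cost(1000, 250, Decimal("0.0001"))
print(f"Estimated compute cost: {estimate}")
```

With idle time contributing nothing to billed_ms, the estimate scales with traffic, which is why the model suits intermittent workloads.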
