Amazon SageMaker Serverless Inference is now generally available



Amazon Web Services (AWS) has announced the general availability of Amazon SageMaker Serverless Inference across 21 AWS Regions, offering a simpler way to deploy machine learning (ML) models for inference.

  • Deploys ML models without requiring you to configure infrastructure
  • Automatically provisions, scales, and manages compute capacity
  • Charges only for the compute capacity used, billed by the millisecond
  • Suits applications with intermittent or unpredictable traffic
  • Adds SageMaker Python SDK support and raises the concurrent invocation limit to 200
  • Lets you create serverless endpoints via the AWS console, SDK, CloudFormation, or CLI
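
As a rough illustration of the SDK route mentioned above, the following sketch builds the request for boto3's `create_endpoint_config` with a `ServerlessConfig` block and then creates the endpoint. It assumes a SageMaker model has already been registered; the names `demo-model`, `demo-config`, and `demo-endpoint`, the helper functions, and the default values are hypothetical. The list of allowed memory sizes is an assumption to verify against the current API reference, while the cap of 200 comes from the concurrency limit cited in the announcement.

```python
from typing import Any, Dict

# Assumption: memory sizes accepted by ServerlessConfig; check the current API reference.
ALLOWED_MEMORY_MB = (1024, 2048, 3072, 4096, 5120, 6144)
# Concurrent invocation limit cited in the announcement.
MAX_CONCURRENCY_LIMIT = 200


def build_serverless_endpoint_config(
    config_name: str,
    model_name: str,
    memory_mb: int = 2048,
    max_concurrency: int = 5,
) -> Dict[str, Any]:
    """Return keyword arguments for create_endpoint_config (hypothetical helper)."""
    if memory_mb not in ALLOWED_MEMORY_MB:
        raise ValueError(f"memory_mb must be one of {ALLOWED_MEMORY_MB}")
    if not 1 <= max_concurrency <= MAX_CONCURRENCY_LIMIT:
        raise ValueError(f"max_concurrency must be between 1 and {MAX_CONCURRENCY_LIMIT}")
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                # ServerlessConfig replaces the instance type and count
                # that an instance-backed variant would specify.
                "ServerlessConfig": {
                    "MemorySizeInMB": memory_mb,
                    "MaxConcurrency": max_concurrency,
                },
            }
        ],
    }


def create_serverless_endpoint(endpoint_name: str, config: Dict[str, Any]) -> None:
    """Send the requests to AWS; needs boto3, credentials, and an existing model."""
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("sagemaker")
    client.create_endpoint_config(**config)
    client.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=config["EndpointConfigName"],
    )


if __name__ == "__main__":
    cfg = build_serverless_endpoint_config("demo-config", "demo-model", 4096, 10)
    print(cfg)
    # Uncomment to create real (billable) resources:
    # create_serverless_endpoint("demo-endpoint", cfg)
```

Keeping the request builder separate from the AWS call makes it possible to validate the configuration locally before any resources are created.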

The service enables machine learning inference without server management, providing a flexible and cost-effective solution for deploying ML models.
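
To make the cost argument concrete, here is a minimal back-of-the-envelope sketch comparing per-millisecond billing with an always-on instance. Both rates are placeholder values chosen for illustration, not AWS prices, and the model is deliberately simplified: it ignores any charges other than compute time.

```python
def serverless_cost(invocations: int, avg_duration_ms: float, price_per_second: float) -> float:
    """Cost when only the milliseconds spent serving requests are billed."""
    billed_seconds = invocations * avg_duration_ms / 1000.0
    return billed_seconds * price_per_second


def always_on_cost(hours: float, price_per_hour: float) -> float:
    """Cost of an instance that is billed whether or not it serves traffic."""
    return hours * price_per_hour


if __name__ == "__main__":
    # Hypothetical workload: 10,000 requests per day at 200 ms each.
    # Hypothetical rates: placeholders only, not published pricing.
    daily_serverless = serverless_cost(10_000, 200.0, price_per_second=0.00005)
    daily_always_on = always_on_cost(24.0, price_per_hour=0.10)
    print(f"serverless: {daily_serverless:.4f} per day")
    print(f"always-on:  {daily_always_on:.4f} per day")
```

With sparse traffic the billed seconds are a small fraction of the day, which is why intermittent workloads tend to benefit; under sustained heavy load the comparison can reverse.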



Related articles

  • Jan 8, 2025: Unlock cost-effective AI inference using Amazon Bedrock serverless capabilities with an Amazon SageMaker trained model
  • Dec 11, 2024: Amazon SageMaker AI announces availability of P5e and G6e instances for Inference
  • Feb 16, 2026: Announcing Amazon SageMaker Inference for custom Amazon Nova models
  • Nov 22, 2024: Amazon SageMaker Inference now supports G6e instances