Announcing Amazon SageMaker Inference for custom Amazon Nova models

AWS News Blog

This article announces the general availability of Amazon SageMaker Inference for custom Amazon Nova models, enabling production-grade deployment and scaling of fine-tuned Nova models.

Deploy custom Nova Micro, Nova Lite, and Nova 2 Lite models with full-rank customization
Supports EC2 G5, G6, and P5 instances with auto-scaling based on 5-minute usage patterns
Configure context length, concurrency, batch size, and other inference parameters
Reduce inference costs through optimized GPU utilization by using G5 and G6 instances rather than P5 instances
Enable an end-to-end customization journey, from training to managed inference deployment
Support streaming and non-streaming real-time inference, plus batch processing
Available in the US East (N. Virginia) and US West (Oregon) Regions
Pay-per-hour billing with no minimum commitments

SageMaker Inference for custom Nova models provides production-ready infrastructure for deploying and scaling customized Nova models with flexible configuration options and cost optimization.
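
As a rough illustration of the auto-scaling point, the sketch below builds the request parameters for a target-tracking policy with Application Auto Scaling, which is how standard SageMaker real-time endpoints are usually scaled. Whether custom Nova endpoints are configured this way is an assumption, not something the announcement states. The endpoint name, the variant name "AllTraffic", the target of 10 invocations per instance, and the 300-second cooldown (chosen here to mirror the 5-minute window) are all hypothetical values.

```python
"""Sketch: target-tracking auto scaling for a SageMaker real-time endpoint.

Assumption: the custom Nova endpoint behaves like a standard SageMaker
endpoint variant registered with Application Auto Scaling.
"""


def build_scaling_requests(endpoint_name, variant_name="AllTraffic",
                           min_instances=1, max_instances=4,
                           invocations_per_instance=10.0,
                           cooldown_seconds=300):
    """Return (register_scalable_target kwargs, put_scaling_policy kwargs)."""
    if not 0 < min_instances <= max_instances:
        raise ValueError("need 0 < min_instances <= max_instances")

    # Fields shared by both Application Auto Scaling calls.
    common = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    }
    target = {**common,
              "MinCapacity": min_instances,
              "MaxCapacity": max_instances}
    policy = {
        **common,
        "PolicyName": f"{endpoint_name}-invocations-target-tracking",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            # Hypothetical target: average invocations per instance.
            "TargetValue": invocations_per_instance,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType":
                    "SageMakerVariantInvocationsPerInstance",
            },
            "ScaleInCooldown": cooldown_seconds,
            "ScaleOutCooldown": cooldown_seconds,
        },
    }
    return target, policy


def apply_scaling(endpoint_name, **kwargs):
    """Send both requests to AWS (needs boto3 and valid credentials)."""
    import boto3  # imported lazily so the builder works without AWS access

    client = boto3.client("application-autoscaling")
    target, policy = build_scaling_requests(endpoint_name, **kwargs)
    client.register_scalable_target(**target)
    client.put_scaling_policy(**policy)
```

Calling `apply_scaling("my-custom-nova-endpoint", max_instances=3)` would register the variant and attach the policy. Keeping the request builder separate from the AWS call makes the configuration easy to inspect or unit-test before anything is sent.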


Related articles

Nov 25, 2024: Amazon SageMaker launches Multi-Adapter Model Inference
Jul 16, 2025: Announcing Amazon Nova customization in Amazon SageMaker AI
Jul 9, 2024: Amazon SageMaker introduces a new generative AI inference optimization capability
Nov 26, 2025: Evaluate models with the Amazon Nova evaluation container using Amazon SageMaker AI