Announcing Amazon SageMaker Inference for custom Amazon Nova models
AWS News Blog
This article announces the general availability of Amazon SageMaker Inference for custom Amazon Nova models, enabling production-grade deployment and scaling of fine-tuned Nova models.
- Deploy custom Nova Micro, Lite, and 2 Lite models with full-rank customization capabilities
- Supports EC2 G5, G6, and P5 instances with auto-scaling based on 5-minute usage patterns
- Configure context length, concurrency, batch size, and other inference parameters
- Reduce inference costs through optimized GPU utilization versus P5 instances
- Enable end-to-end customization journey from training to managed inference deployment
- Support for streaming and non-streaming real-time inference plus batch processing
- Available in US East (N. Virginia) and US West (Oregon) regions
- Pay-per-hour billing with no minimum commitments
SageMaker Inference for custom Nova models provides production-ready infrastructure for deploying and scaling customized Nova models with flexible configuration options and cost optimization.
The AWS News Feed is currently looking for gold sponsors. If you want to support the AWS community and reach a large audience of AWS professionals, consider sponsoring the AWS News Feed.
Related articles
2024
2025
2024
2025
The AWS News Feed is currently looking for silver sponsors. If you want to support the AWS community and reach a large audience of AWS professionals, consider sponsoring the AWS News Feed.