Amazon Bedrock introduces Priority and Flex inference service tiers
This article announces Amazon Bedrock's new Priority and Flex inference service tiers for optimizing AI workload costs and performance.
- Flex tier offers cost-effective pricing for non-time-critical workloads like model evaluations and content summarization
- Priority tier provides premium performance with up to 25% better output token latency for mission-critical applications
- Standard tier remains available for everyday AI applications with reliable performance
- Flex tier receives lower priority during high-demand periods; Priority tier receives preferential processing
- Available for OpenAI, DeepSeek, Qwen3, and Amazon Nova foundation models
Amazon Bedrock's new service tiers enable organizations to balance cost efficiency with performance requirements when scaling AI workloads.
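The trade-off described above can be sketched as a small decision helper. This is only an illustration of the selection logic, not Bedrock's API: the `ServiceTier` labels, the `choose_tier` function, and its two boolean parameters are hypothetical names introduced here, and the actual request field and accepted values should be taken from the Amazon Bedrock API reference.

```python
from enum import Enum


class ServiceTier(str, Enum):
    """Illustrative labels for the three tiers named in the article.

    These strings are assumptions for the sketch, not confirmed API values.
    """
    PRIORITY = "priority"
    STANDARD = "standard"
    FLEX = "flex"


def choose_tier(mission_critical: bool, tolerates_delay: bool) -> ServiceTier:
    """Pick a tier from two coarse workload properties (hypothetical helper).

    - Mission-critical workloads go to Priority for preferential processing.
    - Workloads that can wait (evaluations, summarization) go to Flex,
      accepting lower priority during high-demand periods for a lower price.
    - Everything else stays on Standard.
    """
    if mission_critical:
        return ServiceTier.PRIORITY
    if tolerates_delay:
        return ServiceTier.FLEX
    return ServiceTier.STANDARD


if __name__ == "__main__":
    # Example workloads, loosely following the article's descriptions.
    workloads = {
        "customer-facing chat": (True, False),
        "nightly model evaluation": (False, True),
        "internal Q&A tool": (False, False),
    }
    for name, (critical, delay_ok) in workloads.items():
        print(f"{name}: {choose_tier(critical, delay_ok).value}")
```

Checking the mission-critical flag first means a workload marked as both critical and delay-tolerant still lands on Priority; that ordering is a design choice of this sketch, not something the article specifies.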
Related articles
- Nov 26, 2025: Amazon Bedrock introduces Reserved Service tier
- Nov 18, 2025: New Amazon Bedrock service tiers help you match AI workload performance with cost
- May 28, 2026: Amazon Bedrock expands support for Service Quotas
- Aug 21, 2024: Amazon Bedrock offers select FMs for batch inference at 50% of on-demand inference price