
Improve operational visibility for inference workloads on Amazon Bedrock with new CloudWatch metrics for TTFT and Estimated Quota Consumption

Machine Learning Blog



This article announces two new Amazon CloudWatch metrics for Amazon Bedrock, TimeToFirstToken and EstimatedTPMQuotaUsage, which provide server-side visibility into streaming latency and quota consumption for inference workloads.

  • TimeToFirstToken measures latency from request receipt to first response token generation for streaming APIs
  • EstimatedTPMQuotaUsage tracks tokens-per-minute quota consumed, accounting for burndown multipliers and cache tokens
  • Both metrics are emitted automatically at no cost, with no API changes or opt-in required
  • Both are published in the AWS/Bedrock CloudWatch namespace and can be filtered by the ModelId dimension
  • Both support cross-Region inference profiles, in geographic and global configurations
  • Together they enable proactive alarms, SLA baselines, and capacity planning without client-side instrumentation
  • Quota formula varies by throughput type: on-demand applies output token burndown; provisioned throughput applies cache weighting
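
As a rough illustration of how the two metrics might be read back, the sketch below builds a CloudWatch GetMetricData request for both of them. The namespace, metric names, and ModelId dimension come from the summary above; the choice of statistics (p90 for TimeToFirstToken, Sum per 60-second period for EstimatedTPMQuotaUsage), the helper names, and the model ID passed in are assumptions made for this example, not details from the article.

```python
from datetime import datetime, timedelta, timezone

NAMESPACE = "AWS/Bedrock"  # namespace named in the article summary


def build_queries(model_id, period_seconds=60):
    """Build MetricDataQueries entries for CloudWatch GetMetricData.

    The statistics chosen here (p90 and Sum) are assumptions for illustration.
    """
    dimensions = [{"Name": "ModelId", "Value": model_id}]

    def query(query_id, metric_name, stat):
        return {
            "Id": query_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": NAMESPACE,
                    "MetricName": metric_name,
                    "Dimensions": dimensions,
                },
                "Period": period_seconds,
                "Stat": stat,
            },
            "ReturnData": True,
        }

    return [
        query("ttft_p90", "TimeToFirstToken", "p90"),
        query("tpm_quota_sum", "EstimatedTPMQuotaUsage", "Sum"),
    ]


def fetch_recent(client, model_id, minutes=60):
    """Fetch recent datapoints using a CloudWatch client passed in by the caller.

    Only the first page of results is read; NextToken pagination is omitted.
    """
    end = datetime.now(timezone.utc)
    start = end - timedelta(minutes=minutes)
    response = client.get_metric_data(
        MetricDataQueries=build_queries(model_id),
        StartTime=start,
        EndTime=end,
    )
    # Map each query Id to a list of (timestamp, value) pairs.
    return {
        result["Id"]: list(zip(result["Timestamps"], result["Values"]))
        for result in response["MetricDataResults"]
    }
```

With boto3 installed and credentials configured, `fetch_recent(boto3.client("cloudwatch"), "<your-model-id>")` would return the recent datapoints for each query, keyed by query Id.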

These metrics eliminate the need for custom instrumentation and help teams prevent throttling and performance degradation in production AI workloads.
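
One way to act on the quota metric before throttling starts is a CloudWatch alarm set at a fraction of the account's tokens-per-minute quota. The following is a minimal sketch under stated assumptions: the 80 percent threshold, the Sum statistic over 60-second periods, the three evaluation periods, the alarm name, and both function names are illustrative choices rather than values from the article, and the quota figure has to be supplied by the caller.

```python
def quota_alarm_params(model_id, tpm_quota, percent=80, alarm_name=None):
    """Build keyword arguments for CloudWatch PutMetricAlarm.

    tpm_quota is the caller's tokens-per-minute quota for the model;
    percent is the share of that quota at which the alarm should fire.
    """
    return {
        "AlarmName": alarm_name or f"bedrock-tpm-quota-{percent}pct",
        "Namespace": "AWS/Bedrock",
        "MetricName": "EstimatedTPMQuotaUsage",
        "Dimensions": [{"Name": "ModelId", "Value": model_id}],
        "Statistic": "Sum",   # assumption: total estimated usage per period
        "Period": 60,         # one-minute periods to line up with a per-minute quota
        "EvaluationPeriods": 3,
        "Threshold": tpm_quota * percent / 100,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",  # idle minutes should not raise the alarm
    }


def create_quota_alarm(client, model_id, tpm_quota, percent=80):
    """Create the alarm with a CloudWatch client passed in by the caller."""
    params = quota_alarm_params(model_id, tpm_quota, percent)
    client.put_metric_alarm(**params)
    return params["AlarmName"]
```

Calling `create_quota_alarm(boto3.client("cloudwatch"), "<your-model-id>", tpm_quota=200000)` would create an alarm at 160,000 estimated tokens per minute. A similar alarm on a TimeToFirstToken percentile could serve as a latency baseline check.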




