Enhancing auto scaling resilience by tracking worker utilization metrics

Compute Blog



This article explains how to improve auto scaling resilience by tracking worker utilization instead of system resources like CPU.

  • Traditional CPU-based scaling fails when resource consumption doesn't correlate with application capacity
  • Worker utilization measures the ratio of active work to available processing capacity
  • Formula: total work (backlog + in-flight) divided by total workers
  • Works across mixed instance types, variable latencies, and evolving applications
  • Implement using CloudWatch metric math in target tracking scaling policies
  • A target utilization of 0.7 is the recommended starting value
  • Approach works for SQS-based, ECS, and synchronous API workloads
  • Enables independent optimization of availability and cost without policy changes
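
The formula in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the example numbers, and the idea of sizing the fleet as total work divided by the target are hypothetical and not taken from the article; only the formula and the 0.7 starting target come from the summary.

```python
import math


def worker_utilization(backlog: int, in_flight: int, total_workers: int) -> float:
    """Total work (backlog + in-flight) divided by total workers."""
    if total_workers <= 0:
        raise ValueError("total_workers must be positive")
    return (backlog + in_flight) / total_workers


def workers_for_target(backlog: int, in_flight: int, target: float = 0.7) -> int:
    """Hypothetical helper: worker count that would bring utilization to the target.

    Rounding before ceil() guards against floating-point noise such as
    70 / 0.7 evaluating to slightly more than 100.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    return math.ceil(round((backlog + in_flight) / target, 9))


# 30 queued items plus 40 in progress, spread over 100 workers
print(worker_utilization(30, 40, 100))
# Same 70 units of work: how many workers keep utilization at 0.7?
print(workers_for_target(30, 40))
```

Because the metric counts workers rather than CPU, the same calculation holds whether those workers run on a few large instances or many small ones.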

Worker utilization-based scaling provides more resilient auto scaling that automatically adapts to capacity constraints regardless of instance type or application changes.
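
As a rough sketch of what a metric math target tracking configuration might look like for an SQS-based workload, the snippet below builds the configuration as a plain Python dictionary. The choice of metrics (`ApproximateNumberOfMessagesVisible` as backlog, `ApproximateNumberOfMessagesNotVisible` as in-flight, `GroupInServiceInstances` as fleet size), the fixed `workers_per_instance` value, and the function name are assumptions made for illustration; the article itself may define the metric differently.

```python
def build_target_tracking_config(queue_name: str, asg_name: str,
                                 workers_per_instance: int,
                                 target: float = 0.7) -> dict:
    """Hypothetical helper: worker utilization expressed as CloudWatch metric math."""

    def metric(metric_id, namespace, name, dim_name, dim_value, stat):
        # Input metrics feed the expression and are not returned themselves.
        return {
            "Id": metric_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": namespace,
                    "MetricName": name,
                    "Dimensions": [{"Name": dim_name, "Value": dim_value}],
                },
                "Stat": stat,
            },
            "ReturnData": False,
        }

    return {
        "TargetValue": target,
        "CustomizedMetricSpecification": {
            "Metrics": [
                metric("backlog", "AWS/SQS", "ApproximateNumberOfMessagesVisible",
                       "QueueName", queue_name, "Sum"),
                metric("inflight", "AWS/SQS", "ApproximateNumberOfMessagesNotVisible",
                       "QueueName", queue_name, "Sum"),
                # Assumes group metrics collection is enabled on the Auto Scaling group.
                metric("instances", "AWS/AutoScaling", "GroupInServiceInstances",
                       "AutoScalingGroupName", asg_name, "Average"),
                {
                    # total work / total workers; the only series the policy tracks
                    "Id": "utilization",
                    "Expression": f"(backlog + inflight) / (instances * {workers_per_instance})",
                    "Label": "WorkerUtilization",
                    "ReturnData": True,
                },
            ],
        },
    }


config = build_target_tracking_config("jobs", "workers-asg", workers_per_instance=4)
print(config["CustomizedMetricSpecification"]["Metrics"][-1]["Expression"])
```

A dictionary of this shape could then be passed as `TargetTrackingConfiguration` to the EC2 Auto Scaling `put_scaling_policy` call with `PolicyType="TargetTrackingScaling"`. Since the policy tracks a ratio, changing `workers_per_instance` or the instance mix would not require a different target value.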




Related articles

Jun 30, 2025: SAP Server Auto Scaling: Optimize Infrastructure Costs While Meeting Workload Demands
Sep 4, 2024: Assess Resilience at Scale by using Amazon QuickSight and Amazon Resilience Hub
Nov 22, 2024: Amazon EC2 Auto Scaling introduces highly responsive scaling policies
Jan 14, 2025: Designing an integrated production monitoring and analytics platform to improve Jobs Per Hour and rework ratio