Enhancing auto scaling resilience by tracking worker utilization metrics

Compute Blog

This article explains how to improve auto scaling resilience by tracking worker utilization instead of system resources like CPU.

Traditional CPU-based scaling fails when resource consumption doesn't correlate with application capacity
Worker utilization measures the ratio of active work to available processing capacity
Formula: total work (backlog + in-flight) divided by total workers
Works across mixed instance types, variable latencies, and evolving applications
Implement using CloudWatch metric math in target tracking scaling policies
The recommended starting target utilization is 0.7
Approach works for SQS-based, ECS, and synchronous API workloads
Enables independent optimization of availability and cost without policy changes
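
The formula in the points above can be sketched in a few lines of Python. This is a minimal illustration rather than code from the article: the function names `worker_utilization` and `workers_needed` are hypothetical, and the proportional sizing rule is a simplified stand-in for how a target tracking policy reacts to the metric.

```python
import math


def worker_utilization(backlog: int, in_flight: int, total_workers: int) -> float:
    """Return total work (backlog + in-flight) divided by total workers."""
    if total_workers <= 0:
        raise ValueError("total_workers must be positive")
    return (backlog + in_flight) / total_workers


def workers_needed(backlog: int, in_flight: int, target: float = 0.7) -> int:
    """Estimate the worker count that would bring utilization down to `target`.

    Assumption: capacity scales proportionally with the metric, so the
    required worker count is total work divided by the target utilization.
    """
    if not 0 < target:
        raise ValueError("target must be positive")
    return math.ceil((backlog + in_flight) / target)
```

For example, 30 queued items and 40 in-flight items across 50 workers give a utilization of 70 / 50 = 1.4, well above a 0.7 target, so the fleet would need to grow. Because the ratio counts workers rather than CPU, the same calculation applies whether the workers run on one instance type or several.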

Worker utilization-based scaling provides more resilient auto scaling that automatically adapts to capacity constraints regardless of instance type or application changes.
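
For an SQS-backed Auto Scaling group, a target tracking policy built on CloudWatch metric math might look like the sketch below. It is an illustration under stated assumptions, not the article's exact configuration: the helper name `build_worker_utilization_policy`, the policy name, and the `workers_per_instance` parameter are hypothetical, and the choice of SQS and Auto Scaling group metrics is an assumed mapping of backlog, in-flight work, and worker count.

```python
def build_worker_utilization_policy(queue_name, asg_name,
                                    workers_per_instance, target=0.7):
    """Build keyword arguments for a target tracking policy on worker utilization."""

    def metric_stat(metric_id, namespace, name, dim_name, dim_value, stat):
        # One raw CloudWatch metric used as an input to the expression.
        return {
            "Id": metric_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": namespace,
                    "MetricName": name,
                    "Dimensions": [{"Name": dim_name, "Value": dim_value}],
                },
                "Stat": stat,
            },
            "ReturnData": False,
        }

    metrics = [
        # Assumed backlog: messages waiting in the queue.
        metric_stat("backlog", "AWS/SQS", "ApproximateNumberOfMessagesVisible",
                    "QueueName", queue_name, "Sum"),
        # Assumed in-flight work: messages received but not yet deleted.
        metric_stat("inflight", "AWS/SQS", "ApproximateNumberOfMessagesNotVisible",
                    "QueueName", queue_name, "Sum"),
        # In-service instances; requires group metrics to be enabled on the group.
        metric_stat("instances", "AWS/AutoScaling", "GroupInServiceInstances",
                    "AutoScalingGroupName", asg_name, "Average"),
        {
            # Total work divided by total workers; only this value is returned.
            "Id": "utilization",
            "Expression": f"(backlog + inflight) / (instances * {workers_per_instance})",
            "Label": "Worker utilization",
            "ReturnData": True,
        },
    ]
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": "worker-utilization-target-tracking",  # hypothetical name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "CustomizedMetricSpecification": {"Metrics": metrics},
            "TargetValue": target,
        },
    }

# Possible usage with boto3 (needs AWS credentials, so it is not run here):
#   import boto3
#   policy = build_worker_utilization_policy("jobs-queue", "workers-asg", 4)
#   boto3.client("autoscaling").put_scaling_policy(**policy)
```

Keeping the target value separate from the expression means the target can be tuned toward availability or cost without rewriting the metric definition.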

Jun 30, 2025

SAP Server Auto Scaling: Optimize Infrastructure Costs While Meeting Workload Demands

Sep 4, 2024

Assess Resilience at Scale by using Amazon QuickSight and Amazon Resilience Hub

Nov 22, 2024

Amazon EC2 Auto Scaling introduces highly responsive scaling policies

Jan 14, 2025

Designing an integrated production monitoring and analytics platform to improve Jobs Per Hour and rework ratio

Related articles