A guide to capacity planning for Airflow worker pool in Amazon MWAA
Big Data Blog
This article provides a practical capacity planning framework for Amazon MWAA worker pools, using a financial services scenario with 25% DAG growth to demonstrate how to right-size infrastructure before workload increases hit production.
- Assess current peak concurrent tasks using CloudWatch RunningTasks and QueuedTasks metrics
- Calculate required workers: (peak tasks ÷ tasks per worker) × (1 + safety buffer), rounded up, with a safety buffer of 5-15%
- Target 85-95% peak utilization to maintain SLA compliance and handle unexpected spikes
- Monitor five critical metrics: QueuedTasks, RunningTasks, AdditionalWorkers, Worker CPU, Task Duration
- Set CloudWatch alarms for queue depth, additional workers that stay running permanently, and SLA risk
- Choose strategy: full provisioning (mission-critical), hybrid (balanced), or minimal+scaling (cost-focused)
- Conduct quarterly reviews plus trigger-based assessments when DAGs grow >10% or SLA breaches occur
Effective capacity planning prevents both under-provisioning (SLA breaches) and over-provisioning (cost overruns) by measuring current state, projecting growth, calculating needs with buffers, and monitoring continuously.
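The sizing arithmetic in the list above can be sketched in a few lines of Python. This is a minimal illustration, not the article's own code: the function names, the current peak of 80 tasks, and the 10 tasks per worker are hypothetical values chosen for the example, and only the 25% growth, the 5-15% buffer, the 85-95% utilization target, and the 10% review trigger come from the summary. Integer arithmetic is used so that rounding up is not thrown off by floating-point error.

```python
def ceil_div(numerator: int, denominator: int) -> int:
    """Integer ceiling division (avoids floating-point rounding surprises)."""
    return -(-numerator // denominator)


def projected_peak(current_peak: int, growth_pct: int) -> int:
    """Peak concurrent tasks after the expected growth, rounded up."""
    return ceil_div(current_peak * (100 + growth_pct), 100)


def required_workers(peak_tasks: int, tasks_per_worker: int, buffer_pct: int = 10) -> int:
    """(peak tasks / tasks per worker) * (1 + buffer), rounded up."""
    if tasks_per_worker <= 0:
        raise ValueError("tasks_per_worker must be positive")
    return ceil_div(peak_tasks * (100 + buffer_pct), 100 * tasks_per_worker)


def peak_utilization(peak_tasks: int, tasks_per_worker: int, workers: int) -> float:
    """Fraction of total task slots in use at peak."""
    return peak_tasks / (tasks_per_worker * workers)


def needs_review(old_dag_count: int, new_dag_count: int, sla_breached: bool) -> bool:
    """Trigger-based review: DAG count grew by more than 10% or an SLA was breached."""
    grew_over_10_pct = (new_dag_count - old_dag_count) * 100 > 10 * old_dag_count
    return grew_over_10_pct or sla_breached


if __name__ == "__main__":
    # Hypothetical inputs: 80 concurrent tasks at peak today, 10 tasks per worker.
    peak = projected_peak(current_peak=80, growth_pct=25)
    workers = required_workers(peak, tasks_per_worker=10, buffer_pct=10)
    utilization = peak_utilization(peak, tasks_per_worker=10, workers=workers)
    print(f"projected peak: {peak}, workers: {workers}, utilization: {utilization:.0%}")
```

With these assumed inputs, a 10% buffer puts peak utilization inside the 85-95% target band; a larger buffer lowers utilization and raises cost, which is the trade-off behind choosing among the three provisioning strategies.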