
Maintaining spare capacity during host failures on AWS Outposts with dynamic monitoring

Compute Blog



The article discusses strategies for maintaining spare capacity and ensuring resiliency on AWS Outposts rack, focusing on capacity management during host failures.

  • Introduces the concept of N+M capacity planning, where N is the number of deployed hosts and M is the number of hosts that can fail while workload requirements are still met
  • Outlines three potential recovery scenarios during host hardware failures
  • Presents an automated solution using AWS services like Lambda, EventBridge, CloudWatch, and SNS to monitor and alert on capacity risks
  • Provides a sample monitoring stack with two key Lambda functions:
    • Monitoring stack manager: Creates dynamic CloudWatch alarms and generates detailed resiliency reports
    • Process alarm: Analyzes Outpost capacity and sends immediate risk notifications
  • Recommends integrating capacity monitoring to minimize potential downtime and maintain application resilience
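
The N+M idea in the first bullet can be illustrated with a short calculation. This is only a sketch under simplifying assumptions that are not from the article: all hosts are identical, capacity is counted in interchangeable "slots" of a single instance type, and the function names (`tolerable_host_failures`, `at_risk`) are hypothetical. Real Outposts slotting depends on the instance families configured on each host.

```python
import math


def tolerable_host_failures(deployed_hosts: int, slots_per_host: int, required_slots: int) -> int:
    """Return M: how many hosts can fail while the rest still fit the workload.

    Assumes identical hosts and a single slot size (a simplification).
    A negative result means the workload does not fit even with every host healthy.
    """
    if deployed_hosts < 0 or required_slots < 0:
        raise ValueError("host and slot counts must be non-negative")
    if slots_per_host <= 0:
        raise ValueError("slots_per_host must be positive")
    # Minimum number of healthy hosts needed to place every required slot.
    hosts_needed = math.ceil(required_slots / slots_per_host)
    return deployed_hosts - hosts_needed


def at_risk(deployed_hosts: int, slots_per_host: int, required_slots: int, target_m: int) -> bool:
    """True when fewer than target_m host failures can currently be absorbed."""
    return tolerable_host_failures(deployed_hosts, slots_per_host, required_slots) < target_m


if __name__ == "__main__":
    # Four hosts with 8 slots each, workload needs 20 slots: one host can fail.
    print(tolerable_host_failures(4, 8, 20))
    # After one host is lost, no further failure can be absorbed.
    print(at_risk(3, 8, 20, target_m=1))
```

A check like `at_risk` is the kind of condition a capacity alarm could evaluate after a host failure. How the article's Lambda functions actually compute risk is not described in this summary.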

The solution helps organizations proactively manage their AWS Outposts rack capacity and prepare for potential hardware failures.




Related articles

Mar 31, 2025: Asset level capacity management for AWS Outposts
Nov 18, 2024: Self-service capacity management for AWS Outposts
Aug 8, 2024: Enabling high availability of Amazon EC2 instances on AWS Outposts servers (Part 1)
Aug 8, 2024: Enabling high availability of Amazon EC2 instances on AWS Outposts servers (Part 2)
