How Hapag-Lloyd automated incident management using AWS Step Functions
AWS Cloud Operations Blog
The article describes how Hapag-Lloyd automated its incident management process using AWS Step Functions and Amazon OpenSearch Service to improve response times and provide better context for issue resolution.
- The workflow begins when an application outage is detected by CloudWatch alarms
- Step Functions orchestrates a multi-step incident response process
- Key workflow steps include:
  - Validating that the incident is persistent
  - Querying OpenSearch Service for detailed logs
  - Creating or updating Jira tickets
  - Sending notifications to product teams
  - Updating internal status pages
- The automation helps reduce Mean Time To Respond (MTTR)
- It also gives incident management and application teams rich contextual information
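
The workflow steps listed above can be sketched as an Amazon States Language definition built in Python. This is only an illustration of the shape such a state machine might take, not the definition Hapag-Lloyd uses: the state names, the 120-second wait, the `$.alarmState` field, and the Lambda function names and ARN prefix are all assumptions made for the example.

```python
import json

# Hypothetical ARN prefix; account ID, region, and function names are placeholders.
ARN = "arn:aws:lambda:eu-central-1:123456789012:function:"


def task(function_name, next_state=None):
    """Build a Task state that invokes a (hypothetical) Lambda function."""
    state = {"Type": "Task", "Resource": ARN + function_name}
    if next_state:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


definition = {
    "Comment": "Illustrative incident-response workflow (assumed, not the original)",
    "StartAt": "WaitBeforeRecheck",
    "States": {
        # Give a transient alarm time to clear before acting on it.
        "WaitBeforeRecheck": {"Type": "Wait", "Seconds": 120, "Next": "CheckAlarmState"},
        # Assumes the function returns a payload containing "alarmState".
        "CheckAlarmState": task("check-alarm-state", "IsIncidentPersistent"),
        "IsIncidentPersistent": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.alarmState", "StringEquals": "ALARM", "Next": "QueryLogs"}
            ],
            "Default": "AlarmCleared",
        },
        "AlarmCleared": {"Type": "Succeed"},
        "QueryLogs": task("query-opensearch-logs", "UpsertJiraTicket"),
        "UpsertJiraTicket": task("upsert-jira-ticket", "NotifyProductTeam"),
        "NotifyProductTeam": task("notify-product-team", "UpdateStatusPage"),
        "UpdateStatusPage": task("update-status-page"),
    },
}


def dangling_targets(defn):
    """Return transition targets that do not name a defined state."""
    states = defn["States"]
    targets = {defn["StartAt"]}
    for state in states.values():
        if "Next" in state:
            targets.add(state["Next"])
        if "Default" in state:
            targets.add(state["Default"])
        for choice in state.get("Choices", []):
            targets.add(choice["Next"])
    return sorted(targets - set(states))


if __name__ == "__main__":
    # Print the JSON that could be supplied as a state machine definition.
    print(json.dumps(definition, indent=2))
```

The `Wait` and `Choice` states are one way to model the persistence check: the workflow only continues to log collection, ticketing, and notification if the alarm is still firing after the delay. The `dangling_targets` helper is a small sanity check that every transition points at a state that exists.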
The solution demonstrates how AWS services can be used to streamline and automate complex operational processes, improving overall system reliability and response efficiency.