
How Hapag-Lloyd automated incident management using AWS Step Functions

AWS Cloud Operations Blog



The article describes how Hapag-Lloyd automated its incident management process using AWS Step Functions and Amazon OpenSearch Service to improve response times and provide better context for issue resolution.

  • The workflow begins when an application outage is detected by CloudWatch alarms
  • Step Functions orchestrates a multi-step incident response process
  • Key workflow steps include:
    • Validating the incident is persistent
    • Querying OpenSearch Service for detailed logs
    • Creating or updating Jira tickets
    • Sending notifications to product teams
    • Updating internal status pages
  • The automation reduces Mean Time To Respond (MTTR)
  • It gives incident management and application teams rich contextual information about each incident
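
The workflow steps listed above can be sketched as an Amazon States Language definition built in Python. This is a minimal illustration, not Hapag-Lloyd's actual state machine: the state names, the placeholder Lambda ARNs, the `$.alarmState` input field, the 120-second wait, and the strict step order are all assumptions made for the example.

```python
import json
from typing import Optional

# Hypothetical placeholder ARN; real functions and account details are not given in the article.
LAMBDA_ARN = "arn:aws:lambda:REGION:ACCOUNT_ID:function:{name}"


def task(function_name: str, next_state: Optional[str] = None) -> dict:
    """Build a Task state that invokes an (assumed) Lambda function."""
    state = {"Type": "Task", "Resource": LAMBDA_ARN.format(name=function_name)}
    if next_state is None:
        state["End"] = True  # terminal state of the workflow
    else:
        state["Next"] = next_state
    return state


def build_definition(wait_seconds: int = 120) -> dict:
    """Return a state machine definition mirroring the summarized steps."""
    return {
        "Comment": "Illustrative incident workflow (assumed structure)",
        "StartAt": "WaitBeforeRecheck",
        "States": {
            # Validate that the incident is persistent: wait, then re-check the alarm.
            "WaitBeforeRecheck": {
                "Type": "Wait",
                "Seconds": wait_seconds,
                "Next": "CheckAlarmState",
            },
            "CheckAlarmState": task("check-alarm-state", "IsIncidentPersistent"),
            "IsIncidentPersistent": {
                "Type": "Choice",
                "Choices": [
                    {
                        "Variable": "$.alarmState",
                        "StringEquals": "ALARM",
                        "Next": "QueryLogs",
                    }
                ],
                "Default": "AlarmCleared",
            },
            "AlarmCleared": {"Type": "Succeed"},
            # Gather context, then ticket, notify, and publish status.
            "QueryLogs": task("query-opensearch-logs", "CreateOrUpdateTicket"),
            "CreateOrUpdateTicket": task("upsert-jira-ticket", "NotifyProductTeam"),
            "NotifyProductTeam": task("notify-product-team", "UpdateStatusPage"),
            "UpdateStatusPage": task("update-status-page"),
        },
    }


def happy_path(definition: dict) -> list:
    """Follow the definition from StartAt, taking the first Choice branch."""
    path = []
    name = definition["StartAt"]
    while name is not None:
        path.append(name)
        state = definition["States"][name]
        if state["Type"] == "Choice":
            name = state["Choices"][0]["Next"]
        else:
            name = state.get("Next")  # None for End/Succeed states
    return path


if __name__ == "__main__":
    print(json.dumps(build_definition(), indent=2))
```

The JSON string produced by `json.dumps(build_definition())` is the kind of value that the `definition` parameter of boto3's Step Functions `create_state_machine` call expects, alongside a name and an IAM role ARN. Putting the wait and re-check ahead of the ticketing steps is one way to keep short-lived alarms from opening tickets.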

The solution demonstrates how AWS services can be used to streamline and automate complex operational processes, improving overall system reliability and response efficiency.




Related articles

  • Jan 17, 2025: Resolve your AWS incidents faster by automatically engaging AWS Managed Services
  • May 20, 2025: How to automate incident response for Amazon EKS on Amazon EC2
  • Apr 15, 2024: Automate incident reports from AWS Systems Manager Incident Manager
  • Apr 21, 2026: Automated network incident response with AWS DevOps Agent
