
Improving System Resilience and Observability: Chaos Engineering with AWS FIS and AWS DLT




This article explains how to use AWS Fault Injection Service (AWS FIS, formerly AWS Fault Injection Simulator) and Distributed Load Testing on AWS (DLT) to improve system resilience and observability. It presents a solution architecture that combines DLT for load simulation, AWS FIS for fault injection, and Amazon Managed Grafana for monitoring and visualization.

Specifically, the article covers:

  • Using DLT to simulate realistic traffic patterns and load on application services with JMeter scripts
  • Creating AWS FIS experiment templates to introduce controlled faults such as instance termination, CPU spikes, and network latency
  • Setting up an EC2 instance with InfluxDB and configuring Amazon Managed Grafana with InfluxDB and CloudWatch as data sources
  • Importing Grafana dashboards to monitor application and infrastructure metrics during load and chaos tests
  • Key metrics to monitor for resilience, infrastructure, and application performance
  • Cleaning up resources after testing
  • The benefits of combining AWS FIS, DLT, and Grafana for resilience testing
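
To make the experiment-template step more concrete, here is a minimal Python sketch of what a template for the instance-termination fault might look like. It is not taken from the article: the function name, the `chaos-ready` tag, the target and action labels, and the placeholder ARNs are all hypothetical. The field names follow the request shape of the AWS FIS `CreateExperimentTemplate` API, and `aws:ec2:terminate-instances` is one of the built-in FIS actions. The sketch only builds and prints the request body, so it runs without AWS credentials.

```python
import json


def build_terminate_instance_template(role_arn, alarm_arn,
                                      tag_key="chaos-ready", tag_value="true"):
    """Build a request body for an FIS experiment template.

    The experiment terminates one EC2 instance carrying the given tag and
    stops early if the given CloudWatch alarm fires. The tag, labels, and
    description are illustrative assumptions, not values from the article.
    """
    return {
        "description": "Terminate one tagged EC2 instance while load is running",
        # IAM role that FIS assumes to perform the fault injection
        "roleArn": role_arn,
        # Guard rail: abort the experiment if this alarm goes into ALARM state
        "stopConditions": [
            {"source": "aws:cloudwatch:alarm", "value": alarm_arn}
        ],
        "targets": {
            "oneTaggedInstance": {
                "resourceType": "aws:ec2:instance",
                "resourceTags": {tag_key: tag_value},
                # Pick exactly one matching instance at random
                "selectionMode": "COUNT(1)",
            }
        },
        "actions": {
            "terminateInstance": {
                "actionId": "aws:ec2:terminate-instances",
                # Bind the action's "Instances" slot to the target defined above
                "targets": {"Instances": "oneTaggedInstance"},
            }
        },
        "tags": {"Name": "dlt-chaos-terminate-instance"},
    }


if __name__ == "__main__":
    # Placeholder ARNs; replace them with real resources before use.
    template = build_terminate_instance_template(
        role_arn="arn:aws:iam::123456789012:role/example-fis-role",
        alarm_arn="arn:aws:cloudwatch:us-east-1:123456789012:alarm:example-alarm",
    )
    print(json.dumps(template, indent=2))
    # With boto3 installed and credentials configured, the same dict could be
    # passed to the FIS client:
    #   boto3.client("fis").create_experiment_template(**template)
```

Tying the stop condition to a CloudWatch alarm keeps the fault controlled: if the load generated by DLT pushes the system past an agreed threshold, FIS halts the experiment instead of letting it run to completion.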

