Creating your own Chaos Monkey with AWS Systems Manager Automation

Chaos Engineering on AWS

Adrian Hornsby
The Cloud Architect

--

I’d like to express my gratitude to my colleagues and friends Jason Byrne and Matt Fitzgerald for their valuable feedback.

In a recent post, I explained how to use AWS SSM Run Command to inject failures on EC2 instances. SSM Run Command is well suited to executing custom scripts on EC2 instances, especially to inject latency or blackouts on the network interface, or to exhaust resources such as CPU, memory, and I/O.
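
As a reminder of what that looks like, here is a minimal sketch of a latency injection sent through Run Command with boto3. It is not the script from the earlier post. The function names, the `eth0` interface, and the delay and duration defaults are assumptions made for illustration. The `tc ... netem` commands assume a Linux instance with the SSM agent running and `tc` installed.

```python
def build_latency_command(instance_ids, interface="eth0", delay_ms=300, duration_s=60):
    """Build the keyword arguments for an SSM SendCommand request.

    All defaults here are illustrative assumptions, not recommended values.
    """
    commands = [
        # Add an artificial delay to every packet leaving the interface.
        f"tc qdisc add dev {interface} root netem delay {delay_ms}ms",
        # Keep the failure in place for the duration of the experiment.
        f"sleep {duration_s}",
        # Roll back: remove the netem rule again.
        f"tc qdisc del dev {interface} root netem",
    ]
    return {
        "InstanceIds": list(instance_ids),
        "DocumentName": "AWS-RunShellScript",
        "Parameters": {"commands": commands},
        "Comment": "Chaos experiment: network latency injection",
    }


def inject_latency(instance_ids, **kwargs):
    """Send the command to the target instances and return the command ID."""
    import boto3  # imported here so the builder above works without AWS access

    ssm = boto3.client("ssm")
    response = ssm.send_command(**build_latency_command(instance_ids, **kwargs))
    return response["Command"]["CommandId"]
```

Note that the rollback is just the last line of the script: if the command is cancelled during the `sleep`, the delay stays in place until it is removed by hand.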

However, we need more than that. Failure injection should target resources, network characteristics and dependencies, applications, processes and services, and also the infrastructure.

We also need to have a broad set of controls and capabilities to perform chaos experiments safely. We might want to:

  • Execute commands and scripts directly on EC2 instances.
  • Invoke Lambda functions to run custom scripts.
  • Orchestrate several failure injections to form chaos scenarios.
  • Schedule them for execution at specific times.
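
To make the list above concrete, here is a rough sketch of how several of these capabilities could be combined in a single SSM Automation document, built as a Python dictionary. The step names, the `InstanceId` parameter, the Lambda function name, and the payload are all hypothetical. The actions used (`aws:runCommand`, `aws:sleep`, `aws:invokeLambdaFunction`) are standard Automation actions. Scheduling is deliberately left out of the sketch.

```python
import json


def build_chaos_document(lambda_name, interface="eth0", delay_ms=300, hold="PT1M"):
    """Build a small Automation document that chains several failure steps.

    Every name and default in this function is an illustrative assumption.
    """

    def run_shell(step_name, command):
        # One Run Command step targeting the instance passed as a parameter.
        return {
            "name": step_name,
            "action": "aws:runCommand",
            "inputs": {
                "DocumentName": "AWS-RunShellScript",
                "InstanceIds": ["{{ InstanceId }}"],
                "Parameters": {"commands": [command]},
            },
        }

    return {
        "schemaVersion": "0.3",
        "description": "Chaos scenario sketch: latency, then a custom Lambda step",
        "parameters": {
            "InstanceId": {"type": "String", "description": "Target EC2 instance"},
        },
        "mainSteps": [
            run_shell(
                "injectLatency",
                f"tc qdisc add dev {interface} root netem delay {delay_ms}ms",
            ),
            # Hold the failure for an ISO 8601 duration.
            {"name": "holdFailure", "action": "aws:sleep", "inputs": {"Duration": hold}},
            # Run custom logic in a (hypothetical) Lambda function.
            {
                "name": "runCustomScript",
                "action": "aws:invokeLambdaFunction",
                "inputs": {
                    "FunctionName": lambda_name,
                    "Payload": json.dumps({"experiment": "latency"}),
                },
            },
            run_shell("rollback", f"tc qdisc del dev {interface} root netem"),
        ],
    }


if __name__ == "__main__":
    # "my-chaos-function" is a placeholder name.
    print(json.dumps(build_chaos_document("my-chaos-function"), indent=2))
```

The resulting JSON could then be registered with boto3's `create_document` (with `DocumentType="Automation"`) and started with `start_automation_execution`. Treat this as a starting point and validate it against your own account and permissions first.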

--

Principal System Dev Engineer @ AWS ☁️ I break stuff .. mostly. Opinions here are my own.