Decoupling Chaos and Delivery Pipelines
Architecting for Resilience at Velocity
Introduction
The role of CI/CD pipelines
CI/CD pipelines automate the process of building, testing, and deploying software applications, enabling faster release cycles and more reliable updates by catching issues early and reducing manual errors [1]. This automation lets software teams release updates more frequently without sacrificing quality, availability, or security. It directly serves availability objectives by reducing deployment failures and downtime. It also serves productivity objectives by eliminating tedious manual processes, providing fast feedback to developers, and enabling them to deliver value continuously [2].
As chaos engineering gains traction, many organizations consider embedding chaos experiments alongside these CI/CD pipelines to facilitate continuous resilience testing. However, this approach conflates two very different intents: the standard “tests” that pipelines run and the exploratory “experiments” used in chaos engineering.
Putting chaos experiments directly into CI/CD pipelines often creates operational and cognitive challenges for organizations. Customers who have tried this approach end up with slow pipelines, frustrated engineers, and more questions than answers. By injecting…