“Frequent releases reduce risk” — this is something you hear all the time in conversations about Continuous Delivery. How exactly is this the case? It sounds counter-intuitive. Surely releasing more often is introducing more volatility into production? Isn’t it less risky to hold off releasing as long as possible, taking your time with testing to guarantee confidence in the package?
Let’s think about what we mean by risk.
What is risk?
Risk is the product of the likelihood of a failure occurring and the worst-case impact of that failure:
Risk = Likelihood of failure × Worst case impact of failure
An extremely low-risk activity, then, is one where failure is very unlikely to happen and the impact of a failure is negligible. An activity can also be low risk when either factor, likelihood or impact, is so low that it keeps the overall risk small regardless of the other.
Playing the lottery is low risk — the chance of failing (i.e., not winning) is very high, but the impact of failing (i.e., losing the cost of the ticket) is minimal, and so playing the lottery has few adverse consequences.
Flying is also low risk, with the factors balanced the opposite way. The chance of a failure is extremely low (commercial aviation has an excellent safety record) but the impact of a failure is extremely high. We fly often because we consider the overall risk to be very low.
High-risk activities are those where both factors are high: a high likelihood of failure combined with a high worst-case impact. Extreme sports such as free solo climbing and cave diving fall into this category.
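The examples above can be sketched with some rough numbers. The figures below are invented purely to show how the two factors combine; they are not real statistics.

```python
def risk(likelihood: float, impact: float) -> float:
    """Risk as the product of failure likelihood (0-1) and worst-case impact."""
    return likelihood * impact

# Lottery: failure (not winning) is near-certain, but the impact
# (the cost of the ticket) is tiny.
lottery = risk(likelihood=0.99999, impact=2)

# Flying: failure is vanishingly rare, but the impact is catastrophic.
flying = risk(likelihood=0.0000001, impact=1_000_000)

# Free solo climbing: both factors are high.
free_solo = risk(likelihood=0.1, impact=1_000_000)

# Both low-risk activities come out orders of magnitude below the
# high-risk one, even though their factors are balanced opposite ways.
assert lottery < free_solo and flying < free_solo
```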
Large, infrequent releases are riskier
Rolling many changes into a single release package increases the likelihood of a failure occurring: a lot of change is happening all at once, and any one change can interact badly with the others.
The worst-case impact of a failure includes the release causing an outage or severe data loss, and each change in the release could be the one that causes it.
The instinctive reaction is to try to test for every possible failure. It is a reasonable impulse, but an impossible task: we can test for the known scenarios, but we can’t test for scenarios we don’t know about until they are encountered (the “unknown unknowns”).
This is not to say that testing is pointless; on the contrary, it provides confidence that the changes have not broken expected, known behavior. The tricky part is balancing the desire for thorough testing against the likelihood of the tests finding a failure and the time taken to write, run, and maintain them.
Build up an automated suite of tests that protects against the failure scenarios you know about. Each time a new failure is encountered, add a test for it to the suite. Grow your suite of regression tests, but keep them light, fast, and repeatable.
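As a minimal sketch of that loop, suppose a hypothetical production incident revealed that a `discount_ratio` function crashed with a `ZeroDivisionError` on a zero-priced item (the function and the incident are both invented for illustration). After the fix, the failure scenario is captured as a fast, repeatable regression test alongside the known-good path:

```python
def discount_ratio(original: float, sale: float) -> float:
    """Fraction saved on a sale item."""
    if original == 0:
        return 0.0  # the scenario the hypothetical incident uncovered
    return (original - sale) / original

# Light, fast, repeatable regression tests (pytest-style bare asserts):
def test_known_good_path():
    assert discount_ratio(100.0, 75.0) == 0.25

def test_zero_price_incident():
    # Added after the incident: zero-priced items must not crash.
    assert discount_ratio(0.0, 0.0) == 0.0
```

Each new test costs a little maintenance, so the suite stays valuable only if every test in it encodes a failure you actually care about.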
No matter how much you test, production is the only place where success counts. Small, frequent releases reduce the likelihood of a failure: the smaller the change a release contains, the less likely the release is to contain a failure.
There’s no way to reduce the worst-case impact of a failure; any release could still bring the whole system down or incur severe data loss. But by shrinking the likelihood factor, smaller releases lower the overall risk.
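The arithmetic behind this can be sketched with an invented number: assume each change independently has a 2% chance of introducing a failure (purely illustrative, and real changes are rarely independent). Batching ten changes into one release then multiplies the likelihood factor while the impact factor stays the same:

```python
def release_failure_likelihood(changes: int, per_change: float = 0.02) -> float:
    """Probability that at least one change in the release introduces a failure,
    assuming (illustratively) independent changes."""
    return 1 - (1 - per_change) ** changes

single = release_failure_likelihood(1)    # ≈ 0.02
batched = release_failure_likelihood(10)  # ≈ 0.183

# The worst-case impact is the same either way, so with
# risk = likelihood x impact, the batched release is roughly
# nine times riskier than each small release.
assert batched > single
```

The real ratio will vary, but the shape of the argument holds: the impact term is fixed, so the likelihood term is the only lever, and small releases keep it low.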