The Resilient Architecture Collection
A list of my resiliency-related blog posts.
Series on Resilient Architecture
Resilient systems embrace the idea that failures are typical, and that it’s entirely OK to run applications in what we call partially failing mode. While not suitable for life-critical applications, running in a partially failing mode is a viable option for most web applications. Of course, I’m not saying it doesn’t matter if your system fails. It does, and it might result in lost revenue. But it’s probably not life-critical.
Building resilient architectures has had its ups and downs: some 1 a.m. wake-up calls, some Christmases spent debugging, some “I’m done, I quit” moments. But most of all, it’s been an incredible learning experience and journey.
This blog post is a collection of tips and tricks that have served me well throughout this journey, and I hope they will serve you well too.
Part 1: Embracing failure at scale
In part 1 of this series, I focus on the infrastructure layer, redundancy, immutability, and the concept of infrastructure as code.