The zombie apocalypse could happen any day now...
OK, maybe not.
If we are deploying business critical systems to Amazon Web Services (AWS), we should at least prepare for one. To do that, our system needs to be at least highly available. Some systems need to be fault tolerant.
Highly available: If part of the system goes down, the system recovers with minimal interruption.
Fault tolerant: If part of the system goes down, the system keeps running with no interruption.
The trade-off between the two is often cost. A lot of businesses are content with highly available systems. When done right, the downtime can be minimal. But mission-critical systems often don’t have that luxury, so they must be fault tolerant.
In the context of AWS, regions and availability zones are used to build both highly available and fault tolerant systems.
A region is a collection of data centers located in a specific geographic area. Different regions are at least one airplane ride away from each other. The regions are isolated from one another so that if one is wiped off the map, the others can continue to function.
One region in AWS is us-east-1, in Northern Virginia. Another is us-west-2, in Oregon. These are two different geographic regions, each containing multiple data centers.
Since two regions are geographically distinct and on separate networks, replicating data from one region to another is more expensive than replicating it within a single region.
Availability Zones (AZ)
An availability zone (AZ) is one or more data centers inside a region. Each AZ is linked to the other AZs via dedicated fiber.
The us-west-2 region has three different availability zones that are at least a car drive from one another. If one of the data centers loses power, there is another in the same region that continues to operate.
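To put a rough number on why a second or third zone helps, here is a small back-of-the-envelope sketch. It assumes each zone fails independently of the others, and the 0.99 figure is an illustrative value picked for the example, not a number published by AWS. The function names are hypothetical.

```python
def combined_availability(single_zone_availability: float, zone_count: int) -> float:
    """Probability that at least one of `zone_count` zones is up.

    Assumes zones fail independently, which is a simplification.
    """
    if not 0.0 <= single_zone_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if zone_count < 1:
        raise ValueError("zone_count must be at least 1")
    # The system is down only if every zone is down at the same time.
    return 1.0 - (1.0 - single_zone_availability) ** zone_count


def downtime_hours_per_year(availability: float) -> float:
    """Expected hours of downtime in a 365-day year."""
    return (1.0 - availability) * 365 * 24


# 0.99 is an illustrative per-zone availability, not an AWS figure.
for zones in (1, 2, 3):
    availability = combined_availability(0.99, zones)
    hours = downtime_hours_per_year(availability)
    print(f"{zones} zone(s): availability {availability:.6f}, about {hours:.2f} hours down per year")
```

The point is the shape of the curve rather than the exact values: each extra independent zone multiplies the chance of a total outage by the chance that one more zone is down.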
Regions and availability zones are core building blocks of both fault tolerant and highly available systems. Let’s go through a sample scenario, showing how we can use them to create fault tolerant and highly available systems.
Preventing zombies from killing your services
The benefits of availability zones
Let’s say we have an Amazon Relational Database Service (RDS) database that is hosted in the us-west-2 region. This region has three availability zones: us-west-2a, us-west-2b, and us-west-2c.
But our database only exists in the us-west-2a availability zone.
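If we wanted to confirm this from code, a minimal sketch using the boto3 RDS client's describe_db_instances call might look like the following. The instance name "news-db" and the helper name describe_placement are made up for this example, and the client is passed in so the function can be tried without real AWS credentials.

```python
from typing import Any, Dict


def describe_placement(rds_client: Any, db_instance_id: str) -> Dict[str, Any]:
    """Report which AZ an RDS instance lives in and whether Multi-AZ is on.

    `rds_client` is expected to behave like boto3.client("rds").
    """
    response = rds_client.describe_db_instances(DBInstanceIdentifier=db_instance_id)
    instance = response["DBInstances"][0]
    return {
        "primary_az": instance["AvailabilityZone"],
        "multi_az": instance["MultiAZ"],
        # This key is only present when a Multi-AZ standby exists.
        "standby_az": instance.get("SecondaryAvailabilityZone"),
    }


# Usage against a real account (needs boto3 and credentials, not run here):
#   import boto3
#   rds = boto3.client("rds", region_name="us-west-2")
#   print(describe_placement(rds, "news-db"))  # "news-db" is a hypothetical name
```

For the database in our scenario, we would expect this to report a primary zone of us-west-2a, Multi-AZ turned off, and no standby zone.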
Zombies have breached the perimeter of the data center at us-west-2a. They were hungry and chewed through the power lines. The data center has now lost connectivity.
Users are trying to access local news headlines from our database. They can’t get any news because our database is down.
How could we have prevented this? Multi-AZ deployments of our RDS database.
Instead of having our RDS database live in only the us-west-2a availability zone, we need it to live in more than one zone. RDS makes this easy, because we can turn on Multi-AZ deployments. With Multi-AZ turned on, we can have our primary database in us-west-2a and a standby copy in us-west-2b.
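Turning Multi-AZ on for an existing instance can be sketched with boto3's modify_db_instance call. This is a minimal example under assumptions: the helper name enable_multi_az and the instance name "news-db" are hypothetical, and the client is injected so the function can be exercised with a stand-in.

```python
from typing import Any, Dict


def enable_multi_az(rds_client: Any, db_instance_id: str,
                    apply_immediately: bool = False) -> Dict[str, Any]:
    """Ask RDS to convert an existing instance to a Multi-AZ deployment.

    `rds_client` is expected to behave like boto3.client("rds").
    With apply_immediately=False the change waits for the next
    maintenance window.
    """
    return rds_client.modify_db_instance(
        DBInstanceIdentifier=db_instance_id,
        MultiAZ=True,
        ApplyImmediately=apply_immediately,
    )


# Usage against a real account (needs boto3 and credentials, not run here):
#   import boto3
#   rds = boto3.client("rds", region_name="us-west-2")
#   enable_multi_az(rds, "news-db", apply_immediately=True)
```

Note that the call only asks for a standby. As far as I know, RDS picks which other zone in the region the standby lands in, so us-west-2b here is just the zone used in our scenario.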
Multi-AZ deployments will replicate our database from one availability zone to another automatically. As our application makes updates to the database, those changes are replicated to the other availability zone with no action needed from us.
If for some reason the primary database in us-west-2a fails, RDS automatically switches to the standby in us-west-2b. No action needed from us, and our application continues to function as expected.
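One practical detail: connections that are open at the moment of failover are typically dropped, so it helps if the application retries for a short while instead of giving up on the first error. Here is a minimal retry-with-backoff sketch. The built-in ConnectionError stands in for whatever exception a real database driver raises, and call_with_retry is a hypothetical helper, not part of any AWS library.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(operation: Callable[[], T], attempts: int = 5,
                    base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `operation`, retrying on ConnectionError with exponential backoff.

    ConnectionError is a stand-in; a real driver raises its own error types.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # Out of attempts, let the caller see the failure.
            # Wait base_delay, then 2x, 4x, ... before trying again.
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Because the database endpoint name stays the same after a Multi-AZ failover, the retried operation can simply reconnect to the same hostname.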
The benefits of regions
What if the zombie apocalypse took out the entire state of Oregon? Then the databases in that entire region are no longer connected.
This is a much more severe scenario than one data center being down, as now our entire region is gone. Multi-AZ doesn’t help us, because all the AZs in our region are down. What we need now is multi-region support.
These scenarios are often referred to as Disaster Recovery (DR) scenarios. To support a DR scenario, we need to replicate our database to another geographic region.
Luckily, we can use RDS to create read replicas that are located in separate regions. Our Multi-AZ enabled database is based in us-west-2 (Oregon), and we also have a read replica in us-east-1 (Northern Virginia).
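Creating that replica can be sketched with boto3's create_db_instance_read_replica call. For a cross-region replica, the request goes to the destination region and the source database is identified by its ARN. The helper names, the instance names, and the account ID below are all placeholders for illustration.

```python
from typing import Any, Dict


def rds_instance_arn(region: str, account_id: str, db_instance_id: str) -> str:
    """Build the ARN of an RDS DB instance."""
    return f"arn:aws:rds:{region}:{account_id}:db:{db_instance_id}"


def create_cross_region_replica(destination_rds_client: Any, replica_id: str,
                                source_db_arn: str) -> Dict[str, Any]:
    """Create a read replica in the destination client's region.

    `destination_rds_client` is expected to behave like
    boto3.client("rds", region_name=<destination region>).
    """
    return destination_rds_client.create_db_instance_read_replica(
        DBInstanceIdentifier=replica_id,
        SourceDBInstanceIdentifier=source_db_arn,
    )


# Usage against a real account (needs boto3 and credentials, not run here).
# "news-db", "news-db-replica", and the account ID are placeholders:
#   import boto3
#   rds_east = boto3.client("rds", region_name="us-east-1")
#   source = rds_instance_arn("us-west-2", "123456789012", "news-db")
#   create_cross_region_replica(rds_east, "news-db-replica", source)
```

This sketch leaves out options such as encryption settings, which a real cross-region setup may need.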
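If the zombies eat all of Oregon, we can promote the replica in Northern Virginia to a standalone database and point our application at it. Unlike a Multi-AZ failover, this switch is not automatic.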
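Switching over to a cross-region replica is something we trigger ourselves, and a rough sketch of that step with boto3 could look like this. It promotes the replica with promote_read_replica and then looks up the endpoint our application should connect to. The helper name and the replica name are hypothetical, and the sketch ignores that promotion takes a while to finish.

```python
from typing import Any


def promote_and_get_endpoint(dr_rds_client: Any, replica_id: str) -> str:
    """Promote a read replica to a standalone database and return its hostname.

    `dr_rds_client` is expected to behave like boto3.client("rds") in the
    disaster recovery region. Promotion is asynchronous, so a real script
    would wait for the instance to become available before sending traffic.
    """
    dr_rds_client.promote_read_replica(DBInstanceIdentifier=replica_id)
    response = dr_rds_client.describe_db_instances(DBInstanceIdentifier=replica_id)
    return response["DBInstances"][0]["Endpoint"]["Address"]


# Usage against a real account (needs boto3 and credentials, not run here).
# "news-db-replica" is a placeholder name:
#   import boto3
#   rds_east = boto3.client("rds", region_name="us-east-1")
#   new_host = promote_and_get_endpoint(rds_east, "news-db-replica")
#   # Update the application's database hostname to new_host.
```

Once promoted, the replica stops replicating from Oregon and accepts writes, so the application needs its connection settings pointed at the returned hostname.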
Design Systems For Zombies
Regions are geographic locations that are distinct networks across the globe. Each region consists of multiple availability zones, each made up of one or more data centers, and the zones are connected to one another via dedicated fiber.
If one data center (i.e. AZ) loses power, we want our applications to automatically fail over to another data center in the same region. This is termed Multi-AZ.
But, if we want our applications to fail over to another data center when the entire region is gone, we need multi-region support.
Regions and availability zones are often glossed over when people are beginning to learn AWS. But these are critical components to understand when creating highly available and fault tolerant systems.
Learn AWS By Actually Using It
If you enjoyed this post and are hungry to start learning more about Amazon Web Services, I have created a new course on how to host, secure, and deliver static websites on AWS! It is a book and video course that cuts through the sea of information to accelerate your learning of AWS, giving you a framework that enables you to learn complex things by actually using them.