GCP Checklist 5 — Disaster recovery planning
As part of planning your production environment, you also need a disaster recovery (DR) plan. Service-interrupting events can happen at any time. Your network could have an outage, your latest application push might introduce a critical bug, or you might someday have to contend with a natural disaster. When things go awry, it’s important to have a robust, targeted, and well-tested DR plan.
- Design according to your RTO/RPO values: different parts of your application can have different recovery time objective (RTO) and recovery point objective (RPO) values
- Design for end-to-end recovery: ensure that you have a process for returning to your production environment, including replaying data that was updated while the DR environment was the primary site and replaying logs to your logging system
- Configure security controls so they mirror the permissions in your production environment
- Check software licences for running recovery versions of any applications that require licensing
- Ensure that your CI/CD system will be able to deploy to your recovery environment on GCP
- Ensure users can access the DR environment with the appropriate permissions
- Test your plan regularly
- Keep your DR environment up to date
- If you are comfortable doing so, regularly inject failures into the production environment to simulate micro failures and to exercise your recovery processes or failover to the DR environment
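To make the first and ninth items concrete, here is a minimal Python sketch of a per-component RTO/RPO check you could run after each DR test. It is only an illustration: the `Component` fields, the `check_targets` helper, and the example component names and numbers are all hypothetical and not taken from any GCP API. It assumes that the worst-case data loss equals one full backup (or replication) interval and that you record the recovery time observed in your latest test.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Component:
    """One part of the application with its own recovery targets (minutes)."""
    name: str
    rto_minutes: float                # target: maximum tolerable downtime
    rpo_minutes: float                # target: maximum tolerable data-loss window
    backup_interval_minutes: float    # how often data is backed up or replicated
    measured_recovery_minutes: float  # recovery time seen in the latest DR test


def check_targets(component: Component) -> List[str]:
    """Return human-readable violations; an empty list means targets are met."""
    problems = []
    # Assumption: a failure just before the next backup loses one full interval.
    if component.backup_interval_minutes > component.rpo_minutes:
        problems.append(
            f"{component.name}: backup interval "
            f"{component.backup_interval_minutes} min exceeds RPO "
            f"{component.rpo_minutes} min"
        )
    if component.measured_recovery_minutes > component.rto_minutes:
        problems.append(
            f"{component.name}: measured recovery "
            f"{component.measured_recovery_minutes} min exceeds RTO "
            f"{component.rto_minutes} min"
        )
    return problems


if __name__ == "__main__":
    # Hypothetical components with made-up targets, for illustration only.
    components = [
        Component("orders-db", rto_minutes=60, rpo_minutes=15,
                  backup_interval_minutes=5, measured_recovery_minutes=40),
        Component("reporting", rto_minutes=240, rpo_minutes=60,
                  backup_interval_minutes=120, measured_recovery_minutes=300),
    ]
    for c in components:
        issues = check_targets(c)
        if issues:
            for issue in issues:
                print("FAIL", issue)
        else:
            print("OK  ", c.name)
```

Keeping the targets per component, rather than one pair of values for the whole application, lets a critical database carry a tight RPO while a batch reporting job tolerates a looser one.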
Nice short reading list for you this time round 😃
https://cloud.google.com/solutions/dr-scenarios-planning-guide
https://cloud.google.com/solutions/dr-scenarios-building-blocks
https://cloud.google.com/solutions/dr-scenarios-for-data
https://cloud.google.com/solutions/dr-scenarios-for-applications
Accompanying Checklist:
A list of all the checklists in the series can be found here