Disaster Recovery on Google Cloud for Data (Part 2)

Get Cooking in Cloud

Priyanka Vergadia
Google Cloud - Community
6 min readJan 16, 2020


Introduction

Get Cooking in Cloud is a blog and video series to help enterprises and developers build business solutions on Google Cloud. In this second miniseries, I am covering Disaster Recovery on Google Cloud. Disasters can be pretty hard to deal with when you have an online presence. In these articles, we have been elaborating on how to deal with disasters like earthquakes, power outages, floods, and fires.

Here are all the articles in this miniseries for you to check out.

  1. Disaster Recovery Overview
  2. Cold Disaster recovery on Google Cloud for on-premise applications
  3. Warm Disaster recovery on Google Cloud for on-premise applications
  4. Hot Disaster recovery on Google Cloud for on-premise applications
  5. Cold Disaster recovery for applications in Google Cloud
  6. Warm Disaster recovery for applications in Google Cloud
  7. Hot Disaster recovery for applications in Google Cloud
  8. Disaster recovery on Google Cloud for Data: Part 1
  9. Disaster recovery on Google Cloud for Data: Part 2 (This article)

Since data is the most important part of any application’s recovery, I thought talking specifically about “data” would make sense. Hence, I am focusing the last two articles on DR for data. In the previous article you learned to plan for data recovery when production environment is on-premise or on another cloud. In this one, you will learn to plan for data recovery when production environment is on Google Cloud. So, read on!

What you’ll learn

  • Data backup is important
  • What does data recovery entail?
  • Data backup and recovery
  • Database backup and recovery

Prerequisites

  • Basic concepts and constructs of Google Cloud so you can recognize the names of the products.
  • Read the overview article for DR related definitions.

Check out the video

Disaster Recovery on Google Cloud for data that is in Google Cloud production environment

Data backup is important

Mane Event Planning

“Mane-Event-Planning” is an e-commerce event planning company that currently runs its infrastructure on Google Cloud, so its disaster recovery environment also runs on Google Cloud. Just as with any other online business, data is a critical piece of their application. Let’s dive deeper and help Mane-Event-Planning with some strategies to avoid losing data during a disaster!

What does “Data” recovery entail?

You can only recover data if you have backed it up somewhere. But what do backups mean when it comes to data?

The term “backup” in regard to data covers two scenarios:

  • Data backups: Backing up data alone involves copying a discrete amount of data from one place to another to recover from corruption or when production is down.
  • Database backups: DB backups are slightly more complex because they typically involve recovering to a point in time. Hence, we need to back up not just the database, but also the transaction logs, and then apply those logs to the database backup during recovery.
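To make the second scenario concrete, here is a minimal sketch of what a point-in-time database backup and restore might look like for a MySQL server with binary logging enabled. All paths, dates, and filenames here are illustrative, not from the article:

```shell
# 1. Take a full dump and flush to a fresh binary log, recording the
#    log position inside the dump file as a comment.
mysqldump --single-transaction --flush-logs --master-data=2 \
  --all-databases > /backup/full-$(date +%F).sql

# 2. Continuously archive the binary (transaction) logs somewhere safe.
cp /var/lib/mysql/binlog.0* /backup/binlogs/

# 3. To recover to a point in time: restore the full dump, then replay
#    the archived logs up to the moment just before the failure.
mysql < /backup/full-2020-01-16.sql
mysqlbinlog --stop-datetime="2020-01-16 09:59:00" \
  /backup/binlogs/binlog.0* | mysql
```

The key idea is that the dump alone only gets you back to the moment it was taken; replaying the transaction logs closes the gap to the recovery point.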

Now that we have a basic understanding of data and database backups for DR, let’s consider “Mane-Event-Planning’s” scenario and how they can set up DR specifically for data.

Data Backup and Recovery

Mane-Event-Planning has a tiered storage pattern on Google Cloud, with persistent disks attached to Compute Engine instances. So data backups are simple: they just migrate backup data to a low-cost tier like Nearline or Coldline storage, since the backed-up data is unlikely to be accessed often.
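A minimal sketch of such a tiering setup, assuming backups land in a Cloud Storage bucket (the bucket name and age thresholds are illustrative):

```shell
# Lifecycle rules that automatically move backup objects to cheaper
# tiers as they age: Nearline after 30 days, Coldline after 90.
cat > lifecycle.json <<'EOF'
{
  "rule": [
    {"action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
     "condition": {"age": 30}},
    {"action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
     "condition": {"age": 90}}
  ]
}
EOF

# Apply the rules to the backup bucket.
gsutil lifecycle set lifecycle.json gs://mane-event-backups
```

With rules like these, the tiering happens automatically and no batch job is needed to migrate old backups.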

Typical Tiered Storage pattern in Google Cloud

Database Backup and Recovery

Mane-Event-Planning can use a number of strategies to implement a process to recover a database system within Google Cloud. Let’s look at what databases they use today. They have:

  • One self-managed MySQL database deployed on Compute Engine
  • Cloud Bigtable and BigQuery, which are managed databases

Managed Databases

Managed DBs are designed for scale. Bigtable offers regional replication, which provides higher availability than a single cluster, additional read throughput, and higher durability and resilience in case of zonal failures.
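As a sketch, regional replication in Bigtable comes from creating an instance with clusters in two zones of the same region (the instance, cluster IDs, and zones below are illustrative):

```shell
# A Bigtable instance with two clusters in different zones of the same
# region; Bigtable replicates data between them automatically.
gcloud bigtable instances create mane-events \
  --display-name="Mane Event Planning" \
  --cluster-config=id=events-c1,zone=us-central1-a,nodes=3 \
  --cluster-config=id=events-c2,zone=us-central1-b,nodes=3
```

If the zone hosting one cluster fails, the other cluster keeps serving, which is the zonal resilience described above.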

BigQuery is used here to archive data, and it is a great cost-effective option for long-term storage, since the storage price drops by 50% if a table is not modified for 90 days. The best part about using BigQuery is that data is replicated by default, but don’t forget that replication does not save you from data corruption caused by an erroneous update.

BigQuery storage cost reduces after 90 days if the data is untouched.
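Because replication alone does not protect against a bad update, one guard is to keep an independent copy of important tables, either inside BigQuery or exported to Cloud Storage. A sketch, with illustrative dataset, table, and bucket names:

```shell
# Copy a table into an archive dataset before risky changes, so a bad
# UPDATE can be rolled back by copying the archive table back.
bq cp mane_events.orders mane_events_archive.orders_20200116

# Alternatively, export the table to Cloud Storage for an off-database copy.
bq extract --destination_format=AVRO \
  mane_events.orders 'gs://mane-event-backups/orders/orders-*.avro'
```

Either copy gives you a known-good state to restore from if the live table is corrupted.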

For more on dealing with data corruption, or if you are using other managed Google Cloud services like Cloud Spanner, Cloud Composer, or Cloud Datastore, check out Google Cloud’s disaster recovery documentation.

Self Managed Databases

Now, let’s talk about the one self-managed MySQL database that Mane-Event-Planning has deployed on Compute Engine.

Steps to be taken “before” disaster

  • Create a VPC network.
  • Create a custom image that’s configured with the application service. As part of the image, make sure a persistent disk is attached for the data being processed.
  • Create a snapshot from the attached persistent disk, and configure a startup script to create a persistent disk from the latest snapshot and mount it.
  • Create an instance template from the image we just created.
  • Using this instance template, configure a regional managed instance group with a target size of one.
  • Make sure health checks are configured for the managed instance group.
  • Configure internal load balancing using the regional managed instance group.
  • Configure a scheduled task to create regular snapshots of the persistent disk.
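The last step above, scheduled snapshots, can be sketched with a Compute Engine snapshot schedule. The disk name, region, zone, and retention below are illustrative:

```shell
# Create a daily snapshot schedule that keeps two weeks of snapshots.
gcloud compute resource-policies create snapshot-schedule mysql-daily \
  --region=us-central1 \
  --daily-schedule \
  --start-time=04:00 \
  --max-retention-days=14

# Attach the schedule to the database's persistent disk.
gcloud compute disks add-resource-policies mysql-data \
  --zone=us-central1-f \
  --resource-policies=mysql-daily
```

Once attached, Compute Engine takes the snapshots on schedule with no cron job or manual step involved.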

Steps to be taken “during/after” disaster

Well, Mane-Event-Planning does not need to initiate any failover steps, because they occur automatically. That is the best part of the default HA features available in Google Cloud. In the event of a zonal failure, the regional instance group launches a replacement instance in a different zone in the same region. A new persistent disk is created from the latest snapshot and attached to the new instance.
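The startup script mentioned earlier, which recreates the data disk from the most recent snapshot and mounts it on the replacement instance, might be sketched like this. The disk names and mount point are illustrative, and the script assumes it runs on the new instance with a service account allowed to manage disks:

```shell
# Discover the zone this instance landed in from the metadata server.
ZONE=$(curl -s -H "Metadata-Flavor: Google" \
  http://metadata.google.internal/computeMetadata/v1/instance/zone \
  | awk -F/ '{print $NF}')

# Find the most recent snapshot of the data disk.
LATEST=$(gcloud compute snapshots list \
  --filter="sourceDisk ~ mysql-data" \
  --sort-by=~creationTimestamp --limit=1 --format="value(name)")

# Recreate the data disk from that snapshot and attach it to this instance.
gcloud compute disks create mysql-data-restored \
  --source-snapshot="$LATEST" --zone="$ZONE"
gcloud compute instances attach-disk "$(hostname)" \
  --disk=mysql-data-restored --device-name=mysql-data-restored \
  --zone="$ZONE"

# Mount the restored disk where MySQL expects its data.
mkdir -p /var/lib/mysql
mount /dev/disk/by-id/google-mysql-data-restored /var/lib/mysql
```

Because the script always targets the latest snapshot, recovery point loss is bounded by the snapshot schedule interval.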

Steps to be taken before and after the disaster

In the event a replacement database instance is needed, this configuration:

  • Automatically brings up another database server of the correct version
  • Attaches a persistent disk that has the latest backup and transaction log files
  • Minimizes the need to reconfigure clients that communicate with the database server
  • Ensures that the same Google Cloud security controls (IAM policies, firewall settings) apply to the recovered database server.

To learn more about these components, check out the previous article on cold disaster recovery for applications on Google Cloud.

Well, there you have it. If you have a production application deployed in Google cloud and need to set up Data recovery, then hopefully you learned some strategies to apply in your specific scenario!

Conclusion

Stay tuned for more articles in the Get Cooking in Cloud series.
