Native Backups on Google Cloud

Alistair Grew
Appsbroker CTS Google Cloud Tech Blog
10 min read · Jan 26, 2023

Is Google Cloud Backup and DR a revolution or an evolution?

Source: https://www.atlascloud.co.uk/wp-content/uploads/2017/10/backup.jpg

Introduction

First, I have a confession to make: I care more than most people about backup and DR. In fact, I started my career managing storage arrays and data protection for a global systems integrator. To this day I still have ports 1556 and 13724 etched in my brain from my years as a NetBackup admin. I used to be the guy checking daily that all the backups had gone through, fixing VADP and VSS issues, and getting the 3 AM call when an urgent restore was required. Thankfully those days are now in my rearview mirror, but my concern for data has never faded.

Data is important, and most organisations have pre-existing, well-defined solutions for protecting it on-premises. A question I often get, though, is: how do I provide the same level of protection in the cloud? I have seen many approaches adopted for this, ranging from extending on-prem solutions such as NetBackup and Commvault into the cloud to using snapshots and other native features within products.

Highlighting a key misconception

One misconception which I thankfully don’t hear much of now is that:

In Cloud I don’t need to protect my data because my provider already does…

Whilst it is true that your provider will seek to cover their own backsides in the event of a catastrophic hardware failure, the strategies they implement to achieve this may or may not include backing up your data. It only takes a quick glance at the shared responsibility model to see that content (aka data!) is firmly your responsibility.

Source: https://cloud.google.com/blog/products/containers-kubernetes/exploring-container-security-the-shared-responsibility-model-in-gke-container-security-shared-responsibility-model-gke

I think it surprises people that this even includes SaaS products like Google Workspace and Microsoft 365. In fact, one of the solutions offered by my company’s group sibling CloudM is archiving for both Google and 365.

The data being your responsibility also means that if someone makes a mistake, such as deleting a table or critical files, that is very much on you, as the FAA recently found out. I am far from immune myself, having nearly taken out one of the servers supporting the medical imaging system for Scotland with a failed storage change. I would go as far as to say that having an effective, tested data protection strategy is key to providing a level of psychological safety for your team should mistakes inadvertently be made.

The native Google Cloud backup problem as I see it

What I have felt has been missing, though, is a consistent single-pane-of-glass view for orchestrating native Google Cloud data protection across the multitude of services on offer. Some 3rd party solutions such as Commvault do provide protection for the most common data sources, specifically Google Compute Engine (GCE), the MySQL and PostgreSQL flavours of Cloud SQL, and Spanner. Support for others, such as the SQL Server flavour of Cloud SQL and Filestore, can potentially be inferred.

What can be protected natively?

Source: https://media.makeameme.org/created/it-would-be-0qbtjq.jpg

Well, a fair few things, which can be broadly broken down into Compute, Storage, and Databases. Let’s look at each of these in turn.

Google Compute Engine (GCE)

All the GCE options are helpfully collated by Google here, but in summary the choices are snapshots, disk clones, or images. Regional persistent disks are also an option, but I consider these to be more a question of system availability than data protection as they wouldn’t prevent corruption, for example.

Google Kubernetes Engine (GKE)

Whilst many GKE deployments are ‘stateless’, ‘stateful’ workloads do exist. In the context of GKE, state is ultimately delivered through persistent disk volumes as per GCE, so basically see above…

Google Cloud Storage (GCS)

Whilst GCE and GKE are more ‘conventional’ workloads, GCS is where it gets interesting. Google for their part has designed GCS for 99.999999999% (11 nines) durability, with the availability SLA (depending on the exact configuration) being between 99% and 99.9%. For data protection, the configuration options range from object versioning, for rolling back an individual object, right through to bucket locks for complete WORM functionality.
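
To put those percentages into perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above; the 30-day month, the function names, and the treatment of durability as an independent annual per-object probability are my own simplifying assumptions rather than anything taken from Google's SLA wording.

```python
# Back-of-the-envelope arithmetic for the GCS figures quoted above.
# Assumption: a 30-day month is used as the measurement period.
HOURS_PER_MONTH = 30 * 24


def max_downtime_hours(availability_pct: float, hours: float = HOURS_PER_MONTH) -> float:
    """Hours of unavailability permitted per period at a given availability percentage."""
    return hours * (100.0 - availability_pct) / 100.0


def expected_objects_lost_per_year(object_count: int, durability_nines: int = 11) -> float:
    """Expected objects lost per year, treating durability as an independent
    annual per-object survival probability (a simplification)."""
    annual_loss_probability = 10.0 ** -durability_nines
    return object_count * annual_loss_probability


if __name__ == "__main__":
    for pct in (99.0, 99.9):
        hours_down = max_downtime_hours(pct)
        print(f"{pct}% availability: up to {hours_down:.2f} hours down per 30-day month")
    losses = expected_objects_lost_per_year(1_000_000_000)
    print(f"1 billion objects at 11 nines: {losses:.2f} expected losses per year")
```

Under these assumptions, durability is close to a non-issue while availability is measured in hours per month, and neither number says anything about someone deleting or overwriting an object, which is exactly the gap versioning and bucket locks are there to cover.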

Filestore

Google’s managed NFS offering Filestore can be split between ‘standard’ and ‘enterprise’ with the following options:

Cloud SQL

Google’s Cloud SQL also supports backups with slight differences depending on your flavour (MySQL, Postgres, SQL Server) but simply put there are three main options:

  • Backups — Of both an on-demand and automatic variety.
  • Point in Time Recovery (PITR) — This enables you to revert to an individual transaction if required using database bin logs.
  • Import and Export — This can be done to CSV, SQL Dump or other formats depending on the database flavour used. I have also previously seen this combined with a Cloud Function to push data into a GCS bucket for longer archival purposes.

AlloyDB

For those unfamiliar, AlloyDB is Google’s supercharged Postgres implementation. It has its own backup mechanisms, which are enabled by default.

Cloud Spanner

Unsurprisingly, Google’s premier ACID-compliant, globally distributed database also has protection offerings, which are similar to those of Cloud SQL:

Cloud Bigtable

Google’s low-latency, wide-column database also supports backups to protect against data corruption.

Cloud Firestore/Datastore/Firebase

Google’s popular, document-centric Firestore/Datastore/Firebase database also supports backups or import and export depending on the exact version used (Firestore or Datastore).

Cloud Memorystore

To round off the database offerings, Google also supports RDB snapshots for Redis within Memorystore, as well as import and export, though there isn’t equivalent functionality for Memcached at the time of writing.

BigQuery

Google’s famous data warehousing tool also has mechanisms to protect data, with ‘Time Travel’ enabling point-in-time recovery within the last 7 days. To protect against table deletion there are also ‘table snapshots’, with the related ‘table clones’ in preview as I write this. It is also possible to use Data Catalog to provide some level of recoverability, as one of our engineers found out when a customer engineer accidentally deleted some data.
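
As an illustration of what Time Travel looks like in practice, the sketch below builds a query using BigQuery's `FOR SYSTEM_TIME AS OF` clause. The helper name and the example table ID are hypothetical, and the 7-day limit is simply the window quoted above; treat it as a starting point rather than a definitive recovery procedure.

```python
from datetime import timedelta

# Window quoted in the text above.
MAX_TIME_TRAVEL = timedelta(days=7)


def time_travel_query(table: str, age: timedelta) -> str:
    """Build a query that reads a table as it was `age` ago.

    `table` is a fully qualified table ID; the one used below is hypothetical.
    """
    if age < timedelta(0) or age > MAX_TIME_TRAVEL:
        raise ValueError("age must be between 0 and 7 days")
    seconds = int(age.total_seconds())
    return (
        f"SELECT * FROM `{table}` "
        f"FOR SYSTEM_TIME AS OF "
        f"TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {seconds} SECOND)"
    )


if __name__ == "__main__":
    # Hypothetical table ID, reading the table as it was one hour ago.
    print(time_travel_query("my-project.my_dataset.orders", timedelta(hours=1)))
```

The resulting string could be run in the BigQuery console or through a client library, typically writing the result to a new table so the current one is left untouched while you check what came back.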

As you can see from the above list of 11 different products (and no doubt I have missed one or two), coverage as of the time of writing is pretty good. Bringing this back round to the problem as I see it, though, each of these tools has its own management interface without central coordination (perhaps with the exception of GCE and GKE). I can see this being an increasing pain should you need to prove, for example to auditors, that you have the required protection in place.
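
In the absence of a central view, the best I can offer is something like the sketch below: a hand-maintained inventory of the mechanisms named in this post, plus a small helper that flags which of the products you use need a closer look. The structure and function name are my own invention, and an empty list only means this post doesn't enumerate the options for that product, not that none exist.

```python
# Native protection mechanisms as named in this post.
# An empty list means the post does not enumerate the options for that
# product, not that no options exist.
NATIVE_PROTECTION = {
    "Compute Engine": ["snapshot", "disk clone", "image"],
    "Kubernetes Engine": ["snapshot", "disk clone", "image"],  # via persistent disks, as per GCE
    "Cloud Storage": ["object versioning", "bucket lock"],
    "Filestore": [],
    "Cloud SQL": ["backup", "point-in-time recovery", "import/export"],
    "AlloyDB": ["backup"],
    "Cloud Spanner": [],
    "Cloud Bigtable": ["backup"],
    "Firestore/Datastore": ["backup", "import/export"],
    "Memorystore (Redis)": ["RDB snapshot", "import/export"],
    "BigQuery": ["time travel", "table snapshot", "table clone"],
}


def coverage_report(products_in_use):
    """Split products into those with recorded mechanisms and those to check by hand."""
    covered, to_check = {}, []
    for product in products_in_use:
        mechanisms = NATIVE_PROTECTION.get(product, [])
        if mechanisms:
            covered[product] = mechanisms
        else:
            to_check.append(product)
    return covered, to_check


if __name__ == "__main__":
    covered, to_check = coverage_report(["BigQuery", "Cloud SQL", "Filestore"])
    for product, mechanisms in covered.items():
        print(f"{product}: {', '.join(mechanisms)}")
    print("Check manually:", ", ".join(to_check))
```

It is hardly a single pane of glass, but even a list like this gives you something to hand an auditor, and it makes the lack of central coordination rather obvious.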

Is the Google Cloud Backup and DR tool solving this headache?

For those who don’t know, Google announced a new Backup and DR service in September 2022. This is the result of the Actifio acquisition made in December 2020, and fundamentally it seems to be an update and rebadge of the ‘GO’ product. What I am interested in, though, is what this tool allows me to do that I wasn’t able to do before.

Initial Setup of Backup and DR Service

After spinning up a sandbox project with a default network and a handful of GCE instances I launched Backup and DR Service. I was then prompted to enable the API (backupdr.googleapis.com) and once enabled I was presented with this video:

One of the first quirks I noticed is that I was limited in which regions I could deploy the management interface into. Thankfully europe-west1 (Belgium) was one of the options, but europe-west2 (London) wasn’t, so sovereignty could be an issue for some customers, though the appliance itself looks to be able to be placed anywhere. Once I had selected all this, the console kicked off the provisioning process, which took about 25 minutes (though it can apparently take up to 40) before I could launch into the management interface:

My impressions centred around two points:

  • The interface is quite clean with an initial tutorial indicating where everything is located.
  • Protection is very VM/instance centric, the two ‘Google Cloud options’ being VMware Engine (GCVE) and GCE. A multitude of other applications are supported but require agents to be installed.

Going beyond the management interface and looking at the appliance, this is simply deployed as a GCE instance.

Drilling in a little further and looking at the disk configuration clearly shows the Actifio heritage:

Anyway, let’s try and get a workload protected!

Protecting a GCE Workload

As I mentioned previously, I spun up a handful of VM instances, so I was keen to see how easy it was to protect these. In the main dashboard, I selected Compute Engine, which launched a new interface that should, in theory, have listed my instances. It didn’t until I refreshed by looking at other zones, and even then I ended up with a graphical glitch.

After this, I tried a couple of different things but in the end had to go back to watching Google’s example video, as I didn’t find the interface particularly intuitive. The first step is to create a ‘Backup Policy’, which defines how you want to protect the workloads.

Then you apply this policy to the instances themselves:

It takes a few moments for the configuration to apply but then everything should be good to go!

Who doesn’t love a dashboard?

Fast forwarding to the following morning I am greeted with what every sysadmin wants to see from their backup tool. All green! I especially like the ‘Humming along nicely’ touch :)

This single pane of glass view is how backup applications should look in my opinion, sharing a level of familiarity with Google’s VM Manager tool as well. If you need a little bit more detail the monitor tab allows you to view a more detailed job breakdown.

The whole point of backups is being able to recover…

I’m not going to go into huge detail on how this is achieved but will instead point you at Google’s pre-existing YouTube video. Needless to say, though, rolling back to the snapshot was straightforward.

The restore job also appears within the monitor tab:

My thoughts?

I subtitled this post “Is Google Cloud Backup and DR a revolution or an evolution?” I think it is fair to say that at the moment it is an evolution. There are some things I really like about the solution and some that I think need further development. Let’s break these down, starting with the positives:

  • Google Cloud Native hosting — the solution feels ‘cloud native’ perhaps with the slight exception of the appliances though being able to reach inside instances is always going to require some network connectivity and agents.
  • Single Pane of Glass — I love the high-level overview of status along with the ability to drill down into individual jobs.
  • Integrated billing — Billing for the solution is straightforward and integrated into other Google Cloud billing so you are only managing a single vendor.

Now onto the areas for improvement:

  • Some of the interfaces still need their Google ‘facelift’. Notably, launching the reporting tab gives a very different look and feel.
  • The solution is still very instance/VM centric. What I would love to see is integration with most (ideally all) of the protection options I mentioned earlier in this post, especially the likes of Cloud SQL. I believe additional product integration would make this a differentiating and compelling tool for Google Cloud.
  • There are still some regional restrictions on the management interface which could be a barrier for some customers with sovereignty requirements.

In all, I think the solution is capable, though not quite compelling enough in its feature set yet to compete with ‘extended’ on-prem solutions like NetBackup or Commvault. I do think it shows promise, though, and will be watching the roadmap closely.

Anyway, that’s all for today folks, until next time keep it Googley ;)

About CTS

CTS is the largest dedicated Google Cloud practice in Europe and one of the world’s leading Google Cloud experts, winning 2020 Google Partner of the Year Awards for both Workspace and GCP.

We offer a unique full stack Google Cloud solution for businesses, encompassing cloud migration and infrastructure modernisation. Our data practice focuses on analysis and visualisation, providing industry-specific solutions for Retail, Financial Services, Media and Entertainment.

We’re building talented teams ready to change the world using Google technologies. So if you’re passionate, curious and keen to get stuck in — take a look at our Careers Page and join us for the ride!

Alistair Grew
Appsbroker CTS Google Cloud Tech Blog

GCP Architect based in the Manchester (UK) area. Thoughts here are my own and don’t necessarily represent my employer.