Best Practices for Deploying Google Cloud VMware Engine Protected

Andres Vigil
Google Cloud - Community
5 min read · Jun 21, 2024

In the era of cloud computing, ensuring the availability and recoverability of your critical workloads is paramount. Google Cloud VMware Engine (GCVE) offers a seamless way to migrate and run your VMware workloads in the cloud. However, to truly safeguard your environment, implementing a robust backup and disaster recovery (DR) strategy is non-negotiable. In this blog post, we’ll delve into the best practices for deploying Google Cloud Backup and DR Service (GCBDR) included with the GCVE Protected offering.

When deciding to use GCBDR in conjunction with GCVE, the first and most obvious question to answer before making deployment decisions is: in what context do you plan to use GCBDR? Are you planning to use GCBDR solely as a backup service for local restores? Or do you plan to use GCBDR for protection against regional outages, with the idea of restoring backups in another region if something were to happen? In most cases, the customers I've worked with use GCBDR to protect against both scenarios, which in turn informs how we deploy the GCBDR management console.

If you’ve never deployed GCBDR, there are two major components to the service: the management console (Google Cloud hosted) and the backup/recovery appliance (self-hosted on GCE). In a scenario where you need to restore a backup in a secondary region, you will want to place the management console (screenshot below) in that secondary region. In this blog article we will assume the GCVE production environment is in US-Central1 and DR is in US-East4. The management console is purely a control-plane construct, so communication between the management console and the backup/recovery appliance will be constant but will require minimal data transfer.

Further along in the GCBDR deployment wizard you will be asked to provide the region where the backup/recovery appliance will be deployed. The region you should choose is the same region where the GCVE environment lives, in this case US-Central1. We want to colocate the backup/recovery appliance with the GCVE environment because the appliance is responsible for moving VMware snapshots from the GCVE environment into the backup and DR environment. The appliance also acts as a storage mount point for any backup restores into the VMware environment, so for both performance and cost (network egress) reasons we want the appliance as close to the GCVE environment as possible.
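If you're not sure exactly where your GCVE private cloud lives, you can check from the gcloud CLI. This is a sketch, not a definitive reference: it assumes the `gcloud vmware` command group available in recent gcloud releases, and the project ID and location below are hypothetical placeholders (GCVE private clouds are zonal, so the location value is a zone within the region).

```shell
# List GCVE private clouds in a given location to confirm where the
# environment lives. Project ID and zone here are hypothetical placeholders.
gcloud vmware private-clouds list \
  --project=my-gcve-project \
  --location=us-central1-a
```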

Keep in mind that you can deploy multiple backup/recovery appliances, all managed by a single management console. You may want to deploy a second backup/recovery appliance in US-East4 after the initial setup is complete so that a second appliance is alive and waiting during a DR event. Note that this will increase your cloud costs and is not required during a non-DR scenario. When choosing the backup/recovery appliance there are different “t-shirt” size appliance configurations. If the goal is only to back up crash-consistent VMware snapshots, the “Basic…” option (screenshot below) should be sufficient. If the goal is to support both crash-consistent VMware snapshots and application-consistent database backups, you’ll want to choose the “Standard for databases…” option.

Once you’ve selected all of those options and kicked off the deployment process, there is normally a 40–50 minute wait while all the components are deployed. During this time we will want to deploy the Google Cloud Storage (GCS) bucket where our GCVE backups will be stored. This leads us to our final decision point. There are many options when deploying GCS buckets, starting with the bucket location. You can read more about bucket locations in the link provided, but knowing that we want to protect GCVE backups for the region-to-region DR use case, the best practice is to use a dual-region bucket. A dual-region bucket allows us to explicitly configure the two regions where data will be stored (US-Central1 and US-East4). The biggest benefit of a dual-region bucket for the DR use case is that, in the event of a DR failover, data will be pulled from the region closest to the DR GCVE deployment, which can increase performance and avoid GCS multi-region egress costs. Keep in mind that dual-region buckets are the most costly GCS buckets, so there is a tradeoff to having your data in two deterministic locations.
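As a sketch, creating such a dual-region bucket from the CLI might look like the following. The bucket name and project ID are hypothetical placeholders, and this assumes the `gcloud storage` surface in current gcloud releases, where `--placement` selects the two regions of a configurable dual-region.

```shell
# Create a configurable dual-region bucket spanning us-central1 and us-east4.
# Bucket name and project ID are hypothetical placeholders.
gcloud storage buckets create gs://my-gcve-backup-bucket \
  --project=my-gcve-project \
  --location=us \
  --placement=us-central1,us-east4
```

Placement cannot be changed after the bucket is created, so it's worth double-checking that the two regions match your production and DR GCVE regions before running this.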

Keep in mind there is one additional option, highlighted in red, that you can choose for a dual-region bucket: turbo replication. Without turbo replication, GCS default replication dictates that 100% of newly created objects will be replicated within 12 hours, and 99.9% of newly created objects within 1 hour. If your DR requirements dictate an RPO of less than 12 hours, enabling turbo replication is recommended.
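Turbo replication can be selected at bucket creation or toggled afterwards by setting the bucket's recovery point objective. A sketch, assuming the `gcloud storage` flag name and reusing the hypothetical bucket name from above:

```shell
# Enable turbo replication on an existing dual-region bucket
# (bucket name is a hypothetical placeholder).
gcloud storage buckets update gs://my-gcve-backup-bucket \
  --recovery-point-objective=ASYNC_TURBO

# To fall back to default replication later:
# gcloud storage buckets update gs://my-gcve-backup-bucket \
#   --recovery-point-objective=DEFAULT
```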

Once we’ve determined the bucket location, we also need to choose a storage class for the bucket. This is a far easier decision and ties back to your backup and DR retention requirements. GCS best practices dictate: for retention shorter than 30 days, choose the Standard class; for retention of 30–90 days, choose the Nearline class; and for retention longer than 90 days, choose the Coldline class. In scenarios where retention is measured in years, mostly for businesses facing compliance requirements, the Archive class can also be an option. Keep in mind that for most GCS storage classes (Nearline, Coldline, and Archive) there are early-deletion charges.
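The retention-to-class mapping above can be expressed as a small helper for scripting bucket creation. The one-year boundary separating Coldline from Archive is my own illustrative assumption, not an official cutoff:

```shell
# Map a backup retention period in days to a GCS storage class,
# following the guidance above. The 365-day boundary for Archive
# is an illustrative assumption.
storage_class_for_retention() {
  local days=$1
  if [ "$days" -lt 30 ]; then
    echo "STANDARD"
  elif [ "$days" -le 90 ]; then
    echo "NEARLINE"
  elif [ "$days" -le 365 ]; then
    echo "COLDLINE"
  else
    echo "ARCHIVE"
  fi
}

storage_class_for_retention 14   # prints STANDARD
storage_class_for_retention 60   # prints NEARLINE
```

The result can be passed to `gcloud storage buckets create` via its `--default-storage-class` flag when automating the setup.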

Hopefully creating that GCS bucket didn’t take up all the time needed for Google Cloud to deploy your GCBDR components, which leaves us with one last thing. Because we are using GCBDR to protect GCVE workloads, and because GCBDR will use NFS to mount restored backups to GCVE hosts, we need to make sure the VPC firewall rule (created by the GCBDR deployment process) is edited to support the NFS transport protocols (TCP ports 111, 756, 2049, 4001, and 4045; UDP ports 111, 756, 2049, 4001, and 4045). This is outlined in the deployment best practices guide here.
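The edit can be made in the console or via the CLI. A sketch; the rule name below is a hypothetical placeholder, so substitute the name of the rule the GCBDR deployment actually created:

```shell
# Allow the NFS-related TCP/UDP ports on the firewall rule created by the
# GCBDR deployment (rule name here is a hypothetical placeholder).
gcloud compute firewall-rules update gcbdr-appliance-nfs-rule \
  --allow=tcp:111,tcp:756,tcp:2049,tcp:4001,tcp:4045,udp:111,udp:756,udp:2049,udp:4001,udp:4045
```

Note that `--allow` replaces the rule's entire allowed list rather than appending to it, so include any ports the rule already permits (check with `gcloud compute firewall-rules describe` first).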

This blog article was meant to guide the decisions to be made prior to and up to the point of GCBDR deployment. I plan to write a second blog article covering best practices for configuring GCBDR for GCVE Protected. I will leave you with one last helpful resource: our GCBDR product team has put out a handful of GCBDR deployment walkthrough videos hosted on YouTube.

I am a Customer Engineer at Google Cloud. I help customers Migrate, Protect and Modernize their workloads.