Firestore Backups the easy way with Cloud Workflows

Márton Kodok
Feb 15 · 7 min read

Back up your Firestore collections every night to secure Cloud Storage the easy way with Cloud Workflows; you don’t need to be a developer to set up the steps.


Database backups! We know how important they are: one wrong click and someone could delete a collection or the entire database. When a disaster recovery plan is activated, you need your backups at hand to resume business operations.

Let’s make sure your Firestore/Datastore collections are backed up every night to secure storage.

Introduction

In this article, we are going to orchestrate automated backups via Cloud Workflows, store the exports in Cloud Storage, and trigger the workflow with Cloud Scheduler. These services are fully managed and serverless, and easy to set up even for non-developers. Your project must have billing enabled.

Steps

  1. Create a Cloud Storage bucket for the exports
  2. Create the Cloud Workflows definition to execute the Firestore export API call
  3. Set up IAM permissions to execute the workflow
  4. Set up the nightly invocation via Cloud Scheduler
  5. Run the scheduler to see it in action.

You don’t need to be a developer to set up the steps.

Step 1: Create a Cloud Storage bucket

Go to the Cloud Console and locate the Cloud Storage dashboard. To create a bucket you need to:

  • Name your bucket, e.g. projectname_bucket_for_backups
  • Choose where to store your data: Multi-region
  • Choose a default storage class: Standard
  • Choose how to control access to objects: Uniform
  • Advanced settings: Set a retention policy
  • Retain objects for 1 month.
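If you prefer the command line, the same bucket can be created with gsutil; the bucket name below is the example name from above, and the us multi-region and Standard class match the console choices:

```shell
# Create a Standard-class, multi-region bucket with uniform access control
gsutil mb -c standard -l us -b on gs://projectname_bucket_for_backups

# Retention policy: objects cannot be deleted or modified for 30 days
gsutil retention set 30d gs://projectname_bucket_for_backups
```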

Now that we have a bucket, the most important thing to understand is the last step.

We have set a retention policy specifying the minimum duration, 1 month, that this bucket’s objects are protected from deletion or modification after they are uploaded. With this option, if an account like ours or another developer’s is compromised and the attackers try to wipe out the backups, they won’t be able to, because the objects are retained for 1 month. So even if you are on holiday or on an offline trip when your project is hacked, there is a good chance you will notice within 30 days, and you will still have access to your backups.

On top of the minimum setup, you could set up lifecycle rules, such as:

  • Set to Coldline 7+ days since object was updated
  • Delete object 365+ days since object was updated
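On the command line, the equivalent lifecycle configuration is a small JSON file applied with gsutil. Note that gsutil’s age condition counts days since object creation, which approximates the console options above:

```shell
# lifecycle.json: move objects to Coldline after 7 days, delete after 365
cat > lifecycle.json <<'EOF'
{
  "rule": [
    {"action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
     "condition": {"age": 7}},
    {"action": {"type": "Delete"}, "condition": {"age": 365}}
  ]
}
EOF

gsutil lifecycle set lifecycle.json gs://projectname_bucket_for_backups
```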

Remember your bucket name for later use; you need to add the gs:// prefix to turn it into a path, for example: gs://projectname_bucket_for_backups

Step 2: Create the Cloud Workflow definition to execute Firestore export API call

What is Cloud Workflows?

  • Cloud Workflows lets you define pipelines and orchestrate steps using HTTP based services
  • Integrate any Google Cloud API, SaaS API, or private APIs
  • Out of the box authentication support for Google Cloud products
  • Fully managed service — requires no infrastructure or capacity planning
  • Serverless with Pay-per-use pricing model
  • Declarative workflow language using YAML syntax

Cloud Workflows to execute Firestore exports/backups:

As you see in this firestoreExportDatabase.yaml file, we have an initialize step, where the project is read automatically from the environment, along with the Firestore database id ((default)) and the firestoreBackupBucket where exports/backups will be written.
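A minimal sketch of what such a firestoreExportDatabase.yaml definition can look like; the step names and the bucket path (which sits on line 5 here) are illustrative placeholders, not the author’s exact file:

```yaml
- initialize:
    assign:
      - project: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
      - databaseId: "(default)"
      - firestoreBackupBucket: "gs://projectname_bucket_for_backups"
- firestoreExport:
    call: http.post
    args:
      url: ${"https://firestore.googleapis.com/v1/projects/" + project + "/databases/" + databaseId + ":exportDocuments"}
      auth:
        type: OAuth2
      body:
        outputUriPrefix: ${firestoreBackupBucket}
    result: exportResult
- returnResult:
    return: ${exportResult}
```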

Note: Right now Firestore users cannot generate their own databaseIds, so the default database is currently the glaringly literal string: (default), and yes, you have to include the parentheses.

You need to edit the sample to add your own Storage bucket path to this snippet. The rest of the YAML script doesn’t need any modification: edit line 5, and you are good to go.

To define a workflow go to Cloud Workflows page.

  • You will be prompted to enable the Cloud Workflows API if you haven’t done so for your project. After you enable the API, reopen the console.
  • On the Cloud Workflow Dashboard, hit Create
  • Set a workflow name and description: firestoreExportDocuments
  • Choose region: us-central1
  • You will notice there is a service account preselected for you; make a note of it. It may have the form of
    xxxxxxxxxxx-compute@developer.gserviceaccount.com
  • On the second page, paste the above snippet, make sure you edit the path to your bucket on line 5, which you created previously: gs://projectname_bucket_for_backups
  • By clicking Deploy, your workflow will get deployed.
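The same deployment can also be done from the command line with gcloud, assuming the YAML is saved locally as firestoreExportDatabase.yaml:

```shell
# Deploy (or update) the workflow from the local YAML file
gcloud workflows deploy firestoreExportDocuments \
  --source=firestoreExportDatabase.yaml \
  --location=us-central1
```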

At this point, your workflow should appear in the list on your Workflows page. You can run it now to check for syntax errors, but be aware that the workflow will fail to execute the Firestore backup, as we haven’t yet granted the service account the permissions that authorize Firestore/Datastore export/backup calls.

Step 3: Setup IAM permissions to execute the Cloud Workflow

The service account selected when you defined your workflow in the previous step needs permissions.

We need to authorize it to perform Firestore/Datastore exports and to write to Cloud Storage.

Go to IAM Permissions page, and identify the service account from the list. Choose from the right menu the Edit option.

Add the following permissions:

  • Cloud Datastore Import Export Admin — full access to manage imports and exports.
  • Storage Object Creator — access to create objects in Cloud Storage.
  • Workflows Invoker — access to execute workflows and manage the executions.
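These roles can also be granted from the command line; PROJECT_ID and the service account e-mail below are placeholders you must replace with your own values:

```shell
PROJECT_ID=your-project-id                             # placeholder: your project id
SA=xxxxxxxxxxx-compute@developer.gserviceaccount.com   # the service account from Step 2

# Grant the three roles needed for exports, bucket writes, and workflow execution
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:$SA" --role="roles/datastore.importExportAdmin"
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:$SA" --role="roles/storage.objectCreator"
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:$SA" --role="roles/workflows.invoker"
```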

Note: You can define a specific service account just for this task, or reuse the default “compute” service account. The default one also has an “Editor” role. Leave any roles already on the service account in place.

IAM page capture to set permissions

At this time, you can execute your workflow. The workflow status will show Succeeded when the workflow was able to trigger the Firestore export/backup process. The export itself takes time: depending on your database size, it can take 2–15 minutes until you see a folder in Cloud Storage named with the date of execution. This confirms the output was created.
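A manual execution can also be triggered from the command line, which is handy for testing:

```shell
# Execute the workflow once; the Firestore export continues in the background
gcloud workflows run firestoreExportDocuments --location=us-central1

# After a few minutes, check the bucket for the export output
gsutil ls gs://projectname_bucket_for_backups/
```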

Step 4: Setup the nightly invocation via Cloud Scheduler

To create a scheduled job, go to the Cloud Scheduler page and hit Create Job:

  • Use a name and description for your scheduler, e.g. workflows-datastore-export-backup
  • For a midnight trigger, use this frequency syntax: 0 0 * * *
  • To generate more complex trigger syntaxes see: https://crontab-generator.org/
  • In the Target selector choose HTTP, and as method choose POST
  • Enter the below URL:
  • https://workflowexecutions.googleapis.com/v1beta/projects/<PROJECT-ID>/locations/us-central1/workflows/<WORKFLOW_NAME>/executions
  • You need to edit the above URL to replace the placeholders.
  • Now, permissions: in the Show more section, configure the Auth header as OAuth, add the service account used in Step 3, and for Scope use: https://www.googleapis.com/auth/cloud-platform
  • Leave the other options at their defaults.

PROJECT_ID — you can find it in the URL, e.g. plutoapple-96478
WORKFLOW_NAME — the name of the workflow you want to trigger, e.g. firestoreExportDocuments (or the name you‘ve given to your workflow)
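The same scheduler job can be created with gcloud; the project id, workflow name, and service account below are the same placeholders you must replace:

```shell
# Nightly (midnight) job that POSTs to the Workflows executions API
gcloud scheduler jobs create http workflows-datastore-export-backup \
  --schedule="0 0 * * *" \
  --http-method=POST \
  --uri="https://workflowexecutions.googleapis.com/v1beta/projects/PROJECT_ID/locations/us-central1/workflows/firestoreExportDocuments/executions" \
  --oauth-service-account-email="xxxxxxxxxxx-compute@developer.gserviceaccount.com" \
  --oauth-token-scope="https://www.googleapis.com/auth/cloud-platform"
```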

Cloud Scheduler to launch a nightly Cloud Workflow execution

Step 5: Run the scheduler now to see it in action

Depending on your data size, the whole operation can take between 2 and 15 minutes.

Exports/backups incur costs, as every document in a collection is parsed and read in order to create an export/backup. As Firestore is a serverless product, all these operations count as “reads” and will be part of your monthly bill. Based on this, you can set the frequency of the backup mechanism: once a day or once a week, depending on your organization’s policies and the risks you assume in case of an emergency.

The bucket folder and output files from Firestore/Datastore exports/backups

The output format is not human-readable; the files are packed into many parts, and there is a manifest file that describes the schema and format. To import/restore backups into a Firestore/Datastore instance, you can do it manually from the Datastore UI, but this should be done by your developers.
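A restore can also be started from the command line; the export folder name below is a placeholder for the dated folder created in your bucket:

```shell
# Find the dated export folder created by the workflow
gsutil ls gs://projectname_bucket_for_backups/

# Import it back into Firestore/Datastore (replace the folder name)
gcloud firestore import gs://projectname_bucket_for_backups/EXPORT_FOLDER_NAME/
```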

Wrap Up

As it’s serverless, there is no maintenance of SDK tools and no library updates involved, and even a non-developer can set it up.

If you are a developer, we recommend using VSCode, where you can set up the GCP Project Switcher extension and define IDE tasks to automate deploying, executing, and describing executions.

Feel free to reach out to me on Twitter @martonkodok or read my previous posts on medium/@martonkodok

Google Cloud - Community

Google Cloud community articles and blogs

Márton Kodok

Written by

Speaker at conferences, a Google Developer Expert top user on Stackoverflow, software architect at REEA.net, co-founder IT Mures, life-long learner, mentor

Google Cloud - Community

A collection of technical articles and blogs published or curated by Google Cloud Developer Advocates. The views expressed are those of the authors and don't necessarily reflect those of Google.
