Photo by Matthew Waring on Unsplash

How to use Terraform to schedule backups for your Google Cloud SQL database

Laure Retru-Chavastel · Published in inato · Mar 10, 2020 · 9 min read


You may be thinking “but Google Cloud SQL instances already provide automated and on-demand backups, why would you need Terraform to deal with this?”. Well, the issue is that if your database instance is deleted, for instance after a manual mistake or a Google Cloud SQL outage, so are your backups!

This is why we decided to save those backups in a Google Cloud Storage bucket, and you will soon discover that this process is really easy to set up with Terraform.

Prerequisites:

  • Have a project in Google Cloud Platform (GCP)
  • Have a Cloud SQL database defined in GCP
  • Install the Terraform CLI on your computer (see the documentation, or simply run brew install terraform if you are a Mac user)
  • Install gcloud CLI

Here are the steps we are going to follow in this article:

  1. Set up the Terraform config for your Google Cloud project
  2. Import your database into Terraform
  3. Create a bucket to store your backups
  4. Create a cloud function to store a compressed dump of your database in this bucket
  5. Create a cloud scheduler to trigger a daily backup of your database

1. Set up the Terraform config for your Google Cloud project

  • Run gcloud init and gcloud auth application-default login to allow Terraform to use your access rights when you run terraform commands locally
  • From your terminal, create a new bucket (its name has to be unique among all the buckets that exist in GCP, so we suffix it with the project id):
PROJECT_ID="your-project-id"
BUCKET_NAME="gs://terraform-state-$PROJECT_ID"
gsutil mb -p $PROJECT_ID $BUCKET_NAME
gsutil versioning set on $BUCKET_NAME
  • Go to GCP and check that a bucket terraform-state-your-project-id has been created
  • In your project repository, create a folder terraform with the following architecture:
terraform/
  production/   // I named it after a specific environment,
                // but you can name it as you like
  • Create a file production/main.tf:
terraform {
  backend "gcs" {
    bucket = "terraform-state-your-project-id"
    prefix = "terraform/state"
  }
}

provider "google" {
  region = "europe-west1"
  zone   = "europe-west1-d"
}

You may be wondering “but what is this Terraform state?”. The state is a sort of snapshot of the configuration of the resources you are linking to Terraform; Terraform will use this state to plan and apply changes to your infrastructure.

An interesting advantage of storing this state in a bucket is that everyone on the team shares the same state. Moreover, while you are editing the state, Terraform locks it so that no one else can edit it at the same time.

  • In your terminal, navigate to your terraform/production folder and run terraform init
  • Go to GCP and check that the bucket now contains the Terraform state (with the terraform/state prefix set above, the state of the default workspace is stored as a default.tfstate object)
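You can also check this from your terminal by listing the content of the state bucket (using the bucket name chosen earlier):
gsutil ls -r gs://terraform-state-your-project-id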

2. Import your database into Terraform

Now that Terraform is ready, we can add your database in its state.

  • Create a production/variables.tf file:
variable "project_id" {}
  • Create a production/terraform.tfvars file:
project_id = "your-project-id"
  • In production/main.tf, add the database resource config:
resource "google_sql_database_instance" "your-database" {
  project          = var.project_id
  name             = "your-database"
  database_version = "POSTGRES_9_6"
  region           = "europe-west1"

  settings {
    tier = "db-custom-16-106496"
  }
}

Here, I specified all the required fields of google_sql_database_instance (see the documentation), but also the project field, as it is good practice to make sure your resource is linked to the right project.

If you don’t know if these values are adapted to your database, don’t worry, Terraform won’t change the properties of your database without your consent!

  • In your terminal, navigate to the folder containing your files and import your database config into Terraform:
terraform import google_sql_database_instance.your-database projects/your-project-id/instances/your-sql-database-instance-name

This command will tell Terraform “the resource called google_sql_database_instance.your-database that I have defined in main.tf is linked to the real resource that you can find here: projects/your-project-id/instances/your-sql-database-instance-name”.

  • Let’s run terraform plan to check if there are any differences between the configuration we defined and your real database configuration
  • If there are, edit your configuration in production/main.tf. For instance, if the region of your database is in fact “us-east1” instead of “europe-west1”, change the value of the region field in your configuration.
  • Repeat the two previous steps until the command terraform plan displays: “No changes. Infrastructure is up-to-date.”

3. Create a GCP bucket to store your backups

  • In the terraform folder, create a new folder modules/db-backups, so your architecture should look like this:
terraform/
  modules/
    db-backups/
  production/
    main.tf
    variables.tf
    terraform.tfvars
  • In this db-backups folder, create a file main.tf:
resource "google_storage_bucket" "db-backups" {
  project       = var.project_id
  name          = "db-backups-${var.project_id}"
  location      = "EU"
  storage_class = "MULTI_REGIONAL"
}
  • Again, we will need a file db-backups/variables.tf:
variable "project_id" {}
  • And a file db-backups/terraform.tfvars:
project_id = "your-project-id"
  • Our module is now ready to be used! Go back to production/main.tf and add your module:
module "db_backups" {
  source     = "../modules/db-backups"
  project_id = var.project_id
}
  • In your terminal, navigate to the production folder and run terraform apply (it first does the same as terraform plan and then asks you if you want to apply your changes). You should see that Terraform wants to add a new bucket. Check that nothing else is going to be done, and if so, type “yes” to apply your changes.
  • Go to GCP: you should see a new db-backups bucket in your project. It is empty for now, but we are soon going to fill it!

4. Create a cloud function to store a compressed dump of your database in this bucket

4.1 Create a zip file containing the code of your cloud function

  • First of all, create a new folder inside modules/db-backups/ named backupCloudFunction
  • In there, create a new file index.js containing the code of the cloud function (see the sketch below):
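The content of index.js was embedded as a gist in the original article. Here is a minimal sketch of what it could look like, assuming the function uses the Cloud SQL Admin API (v1beta4) to export a compressed dump into the bucket; the timestamped file name is an arbitrary choice, and the entry point and environment variables match the Terraform resource defined later in this section:

const { google } = require('googleapis');
const { auth } = require('google-auth-library');

const sqladmin = google.sqladmin('v1beta4');

// HTTP cloud function that asks the Cloud SQL Admin API to export
// a compressed dump of the database into the db-backups bucket.
exports.backup = async (req, res) => {
  // Authenticate with the credentials available to the cloud function
  const authClient = await auth.getClient({
    scopes: ['https://www.googleapis.com/auth/cloud-platform'],
  });

  // Hypothetical naming scheme: one timestamped file per backup
  const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
  const uri = `gs://${process.env.BUCKET_NAME}/backup-${timestamp}.gz`;

  try {
    // The ".gz" extension tells Cloud SQL to compress the dump
    await sqladmin.instances.export({
      auth: authClient,
      project: process.env.PROJECT_ID,
      instance: process.env.DB_INSTANCE_NAME,
      requestBody: {
        exportContext: {
          kind: 'sql#exportContext',
          fileType: 'SQL',
          uri,
          databases: [process.env.DB_NAME_TO_EXPORT],
        },
      },
    });
    res.status(200).send(`Backup started: ${uri}`);
  } catch (error) {
    console.error(error);
    res.status(500).send('Backup failed');
  }
};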
  • And also, a package.json file:
{
  "name": "export-database",
  "version": "0.0.1",
  "dependencies": {
    "googleapis": "^39.2.0",
    "google-auth-library": "3.1.2"
  }
}
  • Then, in db-backups/main.tf, let's use Terraform to create a zip, called backup.zip, from those two files:
data "archive_file" "backupZipFile" {
  type        = "zip"
  output_path = "${path.module}/backupCloudFunction/backup.zip"

  source {
    content  = file("${path.module}/backupCloudFunction/index.js")
    filename = "index.js"
  }

  source {
    content  = file("${path.module}/backupCloudFunction/package.json")
    filename = "package.json"
  }
}
  • Run terraform apply and check that the backup.zip file has been created
  • We are now going to create a resource to store the code of our cloud function in the db-backups bucket. In the main.tf of db-backups, add a new resource:
resource "google_storage_bucket_object" "archive" {
  name   = "cloudFunctions/backup-${lower(replace(base64encode(data.archive_file.backupZipFile.output_md5), "=", ""))}.zip"
  bucket = google_storage_bucket.db-backups.name
  # Path of the backup.zip file created by the archive_file data source
  source = data.archive_file.backupZipFile.output_path

  depends_on = [data.archive_file.backupZipFile]
}

You may be surprised by the name attribute of the archive, but it is necessary: the cloud function to which we will give this archive will not detect changes made to the code unless the name of the archive changes. This is why we chose a name that only changes when the content of the archive has been modified (it embeds a hash derived from the MD5 of the zip).

  • Run terraform apply to create this archive. In the Storage menu of GCP, in the db-backups bucket, you should now see a folder cloudFunctions containing a zip file

4.2 Authorize your database to write in the db-backups bucket

  • Add a new resource in db-backups/main.tf:
resource "google_storage_bucket_iam_member" "editor" {
  bucket = google_storage_bucket.db-backups.name
  role   = "roles/storage.objectCreator"
  member = "serviceAccount:${var.service_account_email}"
}
  • Add a new variable in db-backups/variables.tf:
variable "service_account_email" {}
  • Then, in production/main.tf:
module "db_backups" {
  source                = "../modules/db-backups"
  project_id            = var.project_id
  service_account_email = google_sql_database_instance.your-database.service_account_email_address
}
  • Run terraform apply

Your database instance is now allowed to create files in your db-backups bucket.
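If you want to double-check this binding from the command line, inspecting the bucket's IAM policy should show the roles/storage.objectCreator role bound to your database's service account:
gsutil iam get gs://db-backups-your-project-id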

4.3 Create the cloud function resource

  • Before doing anything, you need to enable the Cloud SQL Admin API or your cloud function won’t work
  • In db-backups/main.tf, add a new resource:
resource "google_cloudfunctions_function" "backupFunction" {
  name                  = "backup"
  description           = "Performs a backup of the specified database in the db-backups bucket"
  runtime               = "nodejs10"
  project               = var.project_id
  available_memory_mb   = 128
  source_archive_bucket = google_storage_bucket.db-backups.name
  source_archive_object = google_storage_bucket_object.archive.name
  trigger_http          = true
  entry_point           = "backup"

  environment_variables = {
    PROJECT_ID        = var.project_id
    DB_INSTANCE_NAME  = var.app_db_instance_name
    BUCKET_NAME       = google_storage_bucket.db-backups.name
    DB_NAME_TO_EXPORT = var.db_name_to_export
  }
}
  • As you can see, some more variables are needed: app_db_instance_name and db_name_to_export. Let's add them in db-backups/variables.tf:
variable "app_db_instance_name" {}
variable "db_name_to_export" {}
  • Then, in production/main.tf:
module "db_backups" {
  source                = "../modules/db-backups"
  project_id            = var.project_id
  service_account_email = google_sql_database_instance.your-database.service_account_email_address
  app_db_instance_name  = google_sql_database_instance.your-database.name
  db_name_to_export     = "name-of-the-database-to-export"
}
  • After running terraform apply, go to the Cloud Functions menu of GCP and you should see your cloud function named “backup” there
  • You can already test it (in the “Actions” column, select “Test function”, and then click on “Test the function” without adding anything in the “Triggering event”). Depending on the size of your database, the backup could take a while. But ultimately, you should see a backup appear in your db-backups bucket!
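You can also trigger the function from your terminal instead of the console (a sketch, assuming the function was deployed in the provider's region, europe-west1):
gcloud functions call backup --region=europe-west1 --data='{}'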

4.4 Deal with the “Error waiting for Creating CloudFunctions Function”

When trying to create the cloud function with terraform apply, you may encounter this error:

[...] googleapi: Error 403: Your application has authenticated using end user credentials from the Google Cloud SDK or Google Cloud Shell which are not supported by the cloudfunctions.googleapis.com. We recommend configuring the billing/quota_project setting in gcloud or using a service account through the auth/impersonate_service_account setting. [...]

The Cloud Functions API does not accept end-user credentials: only service accounts are authorized to create cloud functions. So here is a trick to be able to create your cloud function from your terminal anyway:

  • In db-backups/main.tf, create a service account that will have the right permissions (💡 if you already have one with the right permissions, you can skip this creation step and use it for the following steps):
resource "google_service_account" "cloudFunction" {
  project      = var.project_id
  account_id   = "cloud-function"
  display_name = "cloud function"
  description  = "Used to create cloud functions"
}

resource "google_project_iam_member" "cloudFunctionEditor" {
  project = var.project_id
  role    = "roles/editor"
  member  = "serviceAccount:${google_service_account.cloudFunction.email}"
}

resource "google_project_iam_member" "cloudFunctionStorageAdmin" {
  project = var.project_id
  role    = "roles/storage.admin"
  member  = "serviceAccount:${google_service_account.cloudFunction.email}"
}
  • In the Service Accounts menu of GCP, create a key for this service account
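If you prefer the command line over the console, creating the key should look something like this (key.json is an arbitrary output file name, and the e-mail is the one of the service account created above, with your-project-id as a placeholder):
gcloud iam service-accounts keys create key.json \
  --iam-account=cloud-function@your-project-id.iam.gserviceaccount.com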
  • If you have jq, you can use this command to get the key as a single compact line (if not, remove all spaces and line breaks except for those in BEGIN PRIVATE KEY and END PRIVATE KEY):
cat <your_file_path>.json | jq --compact-output
  • Copy the output and then run the following command (don't forget the ' around the Google credentials):
GOOGLE_CREDENTIALS='<paste_key_output_here>' terraform apply

5. Create a cloud scheduler to trigger a daily backup of your database

5.1 Configure an app engine

  • We now have a working cloud function to trigger a backup of our database. Let's say we want to make a daily backup at 4 a.m. in the French timezone.
  • First of all, enable the Cloud Scheduler API in your project
  • Then, to be able to use a cloud scheduler, your project must contain an App Engine application. If you don't have one, you will need to create it.
  • In production/main.tf, add a new resource:
resource "google_app_engine_application" "app_engine" {
  project     = var.project_id
  location_id = "europe-west"
}

⚠️ Once an App Engine application is created, you cannot delete it or edit its location_id. The only way to do so would be to delete the entire project, so define it carefully!

  • Run terraform plan: if you did not already have an App Engine application, Terraform will plan to create it; if you are sure about your configuration, run terraform apply to create it. If you already have one, Terraform will throw an error saying that an App Engine application already exists.
  • (Do this step only if you already have an App Engine application):
terraform import google_app_engine_application.app_engine your-project-id

5.2 Create the cloud scheduler

  • Add a new resource in db-backups/main.tf:
resource "google_cloud_scheduler_job" "backupJob" {
  project     = var.project_id
  region      = var.scheduler_region
  name        = "backup-job"
  description = "Create a backup of the database"
  schedule    = "0 4 * * *"
  time_zone   = "Europe/Paris"

  http_target {
    http_method = "POST"
    uri         = google_cloudfunctions_function.backupFunction.https_trigger_url
  }
}
  • A new variable scheduler_region is needed, because this value has to be compatible with the location you set for your App Engine application. In db-backups/variables.tf, add:
variable "scheduler_region" {}
  • Then, in production/main.tf:
module "db_backups" {
  source                = "../modules/db-backups"
  project_id            = var.project_id
  service_account_email = google_sql_database_instance.your-database.service_account_email_address
  app_db_instance_name  = google_sql_database_instance.your-database.name
  db_name_to_export     = "name-of-the-database-to-export"
  scheduler_region      = "europe-west1"
}

⚠️ In this example, the App Engine application has location_id = "europe-west", which is why I defined scheduler_region = "europe-west1". If your App Engine application has another location_id, find the compatible region for your scheduler, or Terraform will throw an error.
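If you are not sure which location your App Engine application uses, you can check it from your terminal (assuming gcloud points at the right project):
gcloud app describe --format='value(locationId)'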

  • Run terraform apply
  • Go to the Cloud Scheduler menu of GCP: you should see your scheduler ready to work!
  • You can test it right away by clicking the “Test now” button: a backup should then appear in your db-backups bucket!
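The same test can also be run from your terminal (a sketch, assuming gcloud is configured with your project; you may need to specify the job's location):
gcloud scheduler jobs run backup-job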

Congratulations! Daily backups are now scheduled for your database.

Feel free to share your comments or questions below!

Drug discovery is a challenging, intellectually complex, and rewarding endeavor: we help develop effective and safe cures to diseases affecting millions of people. If you’re looking to have a massive impact, join us! inato.com/careers
