How to use Terraform to schedule backups for your Google Cloud SQL database
You may be thinking: "Google Cloud SQL instances already provide automated and on-demand backups, so why use Terraform for this?" Well, the issue is that if your database instance is deleted, for instance after a manual mistake or a Google Cloud SQL outage, so are your backups!
This is why we decided to save those backups in a Google Cloud Storage bucket, and you will soon discover that this process is really easy to do with Terraform.
Prerequisites:
- Have a project in Google Cloud Platform (GCP)
- Have a Cloud SQL database defined in GCP
- Install the Terraform CLI on your computer (see the documentation here, or simply run brew install terraform if you are a Mac user)
- Install the gcloud CLI
Here are the steps we are going to follow in this article:
1. Set up the Terraform config for your Google Cloud project
- Run gcloud init and gcloud auth application-default login so that Terraform benefits from your access rights when you run terraform commands locally
- From your console, create a new bucket (its name has to be unique among all the buckets that exist in GCP, so we suffix it with the project id for this purpose):
PROJECT_ID="your-project-id"
BUCKET_NAME="gs://terraform-state-$PROJECT_ID"
gsutil mb -p $PROJECT_ID $BUCKET_NAME
gsutil versioning set on $BUCKET_NAME
- Go to GCP and check that a bucket terraform-state-your-project-id has been created
- In your project repository, create a folder terraform with the following architecture:
terraform/
  production/   // I named it after the name of a specific environment, but you can name it as you like
- Create a file production/main.tf:
terraform {
  backend "gcs" {
    bucket = "terraform-state-your-project-id"
    prefix = "terraform/state"
  }
}

provider "google" {
  region = "europe-west1"
  zone   = "europe-west1-d"
}
You may be wondering: "but what is this Terraform state?" This state is a sort of snapshot of the configuration of the resources you are about to manage with Terraform. Terraform uses this state to plan and make changes to your infrastructure.
An interesting advantage of storing this state in a bucket is that everyone shares the same configuration. Moreover, while you are editing the state, Terraform locks it so that no one else can edit it at the same time.
- In your console, navigate into your production terraform folder and run terraform init
- Go to GCP and check that the bucket has a file terraform.tfstate which contains the Terraform state
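You can also list the objects under the state prefix from your terminal (using the bucket name and prefix defined above):
gsutil ls gs://terraform-state-your-project-id/terraform/state/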
2. Import your database into Terraform
Now that Terraform is ready, we can add your database in its state.
- Create a production/variables.tf file:
variable "project_id" {}
- Create a production/terraform.tfvars file:
project_id = "your-project-id"
- In production/main.tf, add the database resource config:
resource "google_sql_database_instance" "your-database" {
  project          = var.project_id
  name             = "your-database"
  database_version = "POSTGRES_9_6"
  region           = "europe-west1"

  settings {
    tier = "db-custom-16-106496"
  }
}
Here, I specified all the required fields of google_sql_database_instance (see documentation), but also the project field, as it is good practice to make sure your resource is linked to the right project.
If you are not sure these values match your database, don't worry: Terraform won't change the properties of your database without your consent!
- In your terminal, navigate to the folder containing your files and import your database config into Terraform:
terraform import google_sql_database_instance.your-database projects/your-project-id/instances/your-sql-database-instance-name
This command will tell Terraform: "the resource called google_sql_database_instance.your-database that I have defined in main.tf is linked to the real resource that you can find here: projects/your-project-id/instances/your-sql-database-instance-name".
- Let's run terraform plan to check if there are any differences between the configuration we defined and your real database configuration
- If there are, edit your configuration in production/main.tf. For instance, if the region of your database is in fact "us-east1" instead of "europe-west1", change the value of the region field in your configuration.
- Repeat the two previous steps until the command terraform plan displays: "No changes. Infrastructure is up-to-date."
3. Create a GCP bucket to store your backups
- In the terraform folder, create a new folder modules/db-backups, so your architecture should look like this:
terraform/
  modules/
    db-backups/
  production/
    main.tf
    variables.tf
    terraform.tfvars
- In this db-backups folder, create a file main.tf:
resource "google_storage_bucket" "db-backups" {
  project       = var.project_id
  name          = "db-backups-${var.project_id}"
  location      = "EU"
  storage_class = "MULTI_REGIONAL"
}
- Again, we will need a file db-backups/variables.tf:
variable "project_id" {}
- And a file db-backups/terraform.tfvars:
project_id = "your-project-id"
- Our module is now ready to be used! Go back to production/main.tf and add your module:
module "db_backups" {
  source     = "../modules/db-backups"
  project_id = var.project_id
}
- In your terminal, navigate to the production folder and run terraform apply (it first does the same as terraform plan and then asks you if you want to apply your changes). You should see that Terraform wants to add a new bucket. Check that nothing else is going to be done, and if so, type "yes" to apply your changes.
- Go to GCP: you should see a new bucket db-backups in your project. It is empty for now, but we are soon going to fill it!
4. Create a cloud function to store a compressed dump of your database in this bucket
4.1 Create a zip file containing the code of your cloud function
- First of all, create a new folder inside modules/db-backups/ named backupCloudFunction
- In there, create a new file index.js
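Here is a minimal sketch of what this index.js could look like. It assumes the environment variables we will pass to the cloud function later in this article (PROJECT_ID, DB_INSTANCE_NAME, BUCKET_NAME, DB_NAME_TO_EXPORT) and relies only on the googleapis dependency declared in the package.json below; the exact request fields may vary with your googleapis version:
const { google } = require("googleapis");

const sqladmin = google.sqladmin("v1beta4");

exports.backup = async (req, res) => {
  // Authenticate as the cloud function's service account
  const authClient = await google.auth.getClient({
    scopes: ["https://www.googleapis.com/auth/cloud-platform"],
  });

  try {
    // Ask the Cloud SQL Admin API to export a dump of the database into the bucket.
    // A file name ending with .gz tells Cloud SQL to compress the dump.
    const result = await sqladmin.instances.export({
      auth: authClient,
      project: process.env.PROJECT_ID,
      instance: process.env.DB_INSTANCE_NAME,
      requestBody: {
        exportContext: {
          kind: "sql#exportContext",
          fileType: "SQL",
          uri: `gs://${process.env.BUCKET_NAME}/backup-${Date.now()}.gz`,
          databases: [process.env.DB_NAME_TO_EXPORT],
        },
      },
    });
    res.status(200).send(result.data);
  } catch (err) {
    console.error(err);
    res.status(500).send(err);
  }
};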
- And also, a package.json file:
{
"name": "export-database",
"version": "0.0.1",
"dependencies": {
"googleapis": "^39.2.0",
"google-auth-library": "3.1.2"
}
}
- Then, let's use Terraform to create a zip called backup.zip from those two files. In db-backups/main.tf, add:
data "archive_file" "backupZipFile" {
  type        = "zip"
  output_path = "${path.module}/backupCloudFunction/backup.zip"

  source {
    content  = "${file("${path.module}/backupCloudFunction/index.js")}"
    filename = "index.js"
  }

  source {
    content  = "${file("${path.module}/backupCloudFunction/package.json")}"
    filename = "package.json"
  }
}
- Run terraform apply and check that the backup.zip file has been created
- We are now going to create a resource to store the code of our cloud function in the db-backups bucket. In the main.tf of db-backups, add a new resource:
resource "google_storage_bucket_object" "archive" {
name = "cloudFunctions/backup-${lower(replace(base64encode(data.archive_file.backupZipFile.output_md5), "=", ""))}.zip"
bucket = google_storage_bucket.db-backups.name
# Source path is relative to where the module is used : to improve
source = data.archive_file.backupZipFile.output_path
depends_on = [data.archive_file.backupZipFile]
}
You may be surprised by the name attribute of the archive, but this is necessary: the cloud function to which we will give this archive will not detect changes made to this file unless its name changes. This is why we chose a name that changes only when the file's content has been modified.
- Run terraform apply to create this archive. In the Storage menu of GCP, in the db-backups bucket, you should now see a folder cloudFunctions containing a zip file
4.2 Authorize your database to write in the db-backups bucket
- Add a new resource in db-backups/main.tf:
resource "google_storage_bucket_iam_member" "editor" {
  bucket = google_storage_bucket.db-backups.name
  role   = "roles/storage.objectCreator"
  member = "serviceAccount:${var.service_account_email}"
}
- Add a new variable in db-backups/variables.tf:
variable "service_account_email" {}
- Then, in production/main.tf:
module "db_backups" {
  source                = "../modules/db-backups"
  project_id            = var.project_id
  service_account_email = google_sql_database_instance.your-database.service_account_email_address
}
- Run terraform apply. Your database is now allowed to create files in your db-backups bucket.
4.3 Create the cloud function resource
- Before doing anything, you need to enable the Cloud SQL Admin API or your cloud function won’t work
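You can enable it from the API Library in the GCP console, or with gcloud:
gcloud services enable sqladmin.googleapis.com --project your-project-id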
- In db-backups/main.tf, add a new resource:
resource "google_cloudfunctions_function" "backupFunction" {
name = "backup"
description = "Performs a backup of the specified database in the db-backups bucket"
runtime = "nodejs10"
project = var.project_id available_memory_mb = 128
source_archive_bucket = google_storage_bucket.db-backups.name
source_archive_object = google_storage_bucket_object.archive.name
trigger_http = true
entry_point = "backup"environment_variables = {
PROJECT_ID = var.project_id,
DB_INSTANCE_NAME = var.db_instance_name,
BUCKET_NAME = google_storage_bucket.db-backups.name
DB_NAME_TO_EXPORT = var.db_name_to_export
}
}
- As you can see, some more variables are needed: app_db_instance_name and db_name_to_export. Let's add them in db-backups/variables.tf:
variable "app_db_instance_name" {}
variable "db_name_to_export" {}
- Then, in production/main.tf:
module "db_backups" {
  source                = "../modules/db-backups"
  project_id            = var.project_id
  service_account_email = google_sql_database_instance.your-database.service_account_email_address
  app_db_instance_name  = google_sql_database_instance.your-database.name
  db_name_to_export     = "name-of-the-database-to-export"
}
- After running terraform apply, go to the Cloud Functions menu of GCP and you should see your cloud function named "backup" there
- You can already test it (in the "Actions" column, select "Test function", and then click on "Test the function" without adding anything in the "Triggering event"). Depending on the size of your database, the backup could take a while, but ultimately you should see a backup appear in your db-backups bucket!
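You can also trigger the function from your terminal with gcloud (assuming it was deployed in the provider's europe-west1 region):
gcloud functions call backup --region europe-west1 --project your-project-id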
4.4 Deal with the “Error waiting for Creating CloudFunctions Function”
When trying to create the cloud function with terraform apply, you may encounter this error:
[...] googleapi: Error 403: Your application has authenticated using end user credentials from the Google Cloud SDK or Google Cloud Shell which are not supported by the cloudfunctions.googleapis.com. We recommend configuring the billing/quota_project setting in gcloud or using a service account through the auth/impersonate_service_account setting. [...]
The Cloud Functions API does not accept end-user credentials: only service accounts are authorized to create cloud functions. So here is a trick to create your cloud function from your terminal anyway:
- In db-backups/main.tf, create a service account that will have the right permissions (💡 if you already have one with the right permissions, you can skip this creation step and use it for the following steps):
resource "google_service_account" "cloudFunction" {
project = var.project_id
account_id = "cloud-function"
display_name = "cloud function"
description = "Used to create cloud functions"
}resource "google_project_iam_member" "cloudFunctionEditor" {
project = var.project_id
role = "roles/editor"
member = "serviceAccount:${google_service_account.cloudFunction.email}"
}
resource "google_project_iam_member" "cloudFunctionStorageAdmin" {
project = var.project_id
role = "roles/storage.admin"
member = "serviceAccount:${google_service_account.cloudFunction.email}"
}
- In the Service Accounts menu of GCP, create a key for this service account
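If you prefer the command line, the key can also be created with gcloud (the service account email below follows from the account_id defined above, and the output file name is up to you):
gcloud iam service-accounts keys create your_file_path.json --iam-account cloud-function@your-project-id.iam.gserviceaccount.com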
- If you have jq, you can use this command to get a well-formatted output of the key (if not, remove all spaces and line breaks except for those in BEGIN PRIVATE KEY and END PRIVATE KEY):
cat <your_file_path>.json | jq --compact-output
- Copy the output and then run the following command (don't forget the ' around the Google credentials):
GOOGLE_CREDENTIALS='<paste_key_output_here>' terraform apply
5. Create a cloud scheduler to trigger a daily backup of your database
5.1 Configure an app engine
- We now have a working cloud function to trigger a backup of our database. Let's say we want to make a daily backup at 4 a.m. in the French timezone.
- First of all, enable the Cloud Scheduler API in your project
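As with the Cloud SQL Admin API, you can enable it from the API Library or with gcloud:
gcloud services enable cloudscheduler.googleapis.com --project your-project-id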
- Then, to be able to use a cloud scheduler, your project must contain an App Engine application. If you don't have one, you need to create it.
- In production/main.tf, add a new resource:
resource "google_app_engine_application" "app_engine" {
  project     = var.project_id
  location_id = "europe-west"
}
⚠️ Once an App Engine application is created, you cannot delete it or edit its location_id. The only way to do so would be to delete the entire project, so define it carefully!
- Run terraform plan: if you did not already have an App Engine application, Terraform will plan to create it. If you are sure about your configuration, apply it. If you already had one, Terraform will throw an error saying that an App Engine application already exists.
- (Do this step only if you already have an App Engine application):
terraform import google_app_engine_application.app_engine your-project-id
5.2 Create the cloud scheduler
- Add a new resource in db-backups/main.tf:
resource "google_cloud_scheduler_job" "backupJob" {
project = var.project_id
region = var.scheduler_region
name = "backup-job"
description = "Create a backup of the database"
schedule = "0 4 * * *"
time_zone = "Europe/Paris"http_target {
http_method = "POST"
uri = google_cloudfunctions_function.backupFunction.https_trigger_url
}
}
- A new variable scheduler_region is needed, because this value has to be compatible with the location you set for your App Engine application. In db-backups/variables.tf, add:
variable "scheduler_region" {}
- Then, in production/main.tf:
module "db_backups" {
  source                = "../modules/db-backups"
  project_id            = var.project_id
  service_account_email = google_sql_database_instance.your-database.service_account_email_address
  app_db_instance_name  = google_sql_database_instance.your-database.name
  db_name_to_export     = "name-of-the-database-to-export"
  scheduler_region      = "europe-west1"
}
⚠️ In this example, the App Engine application has location_id = "europe-west", which is why I defined scheduler_region = "europe-west1". If your App Engine application has another location_id, find the compatible region for your scheduler, or Terraform will throw an error.
- Run terraform apply
- Go to the Cloud Scheduler menu of GCP: you should see your scheduler ready to work!
- You can test it right away by clicking the "Test now" button: a backup should then appear in your db-backups bucket!
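You should also be able to trigger the job from your terminal with gcloud (assuming the europe-west1 scheduler region defined above):
gcloud scheduler jobs run backup-job --location europe-west1 --project your-project-id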
Congratulations! Daily backups are now scheduled for your database.
Feel free to share your comments or questions below!