How to cruise through Database Migration Service on Google Cloud — the REST API way!

Sneha Choudhary
Google Cloud - Community
15 min read · Oct 17, 2022

The disruption caused by the advent of cloud computing, and the digital transformation it spurred, has compelled enterprises to expedite their journey to the cloud. The need for rapid application modernisation in order to stay competitive is certainly keeping every organisation on its toes. Database migration is an integral part of any application modernisation effort, since migrating applications to the cloud calls for migrating the accompanying data with them.

Google Cloud Database Migration Service

Google Cloud Database Migration Service (or DMS) simplifies the migration of database workloads to Cloud SQL (a reliable, secure and cost-effective cloud database), at no additional base charge for native migrations*. Furthermore, it is fully managed and doesn’t add any operational or management overhead.

*Refer to Table #1 below for additional details on cost considerations.

Table #1

In this article, we will explore the end-to-end migration of a PostgreSQL instance and its databases using the DMS REST API commands.

Note: It’s a long read … Wondering why I didn’t shorten this …

For the purpose of this write-up, the source PostgreSQL instance was set up on a GCE VM. Reference link for explorers.

Let’s get started with the pre-requisites!

These are a few configurations that must be applied to the source instance and its underlying databases before they can be migrated via DMS:

  1. Installing the pglogical package on the source instance and modifying necessary configurations.
  2. Availability of a user with the replication role (referred to as the “migration user” in this article).
  3. Installing pglogical EXTENSION on every database on the source instance and granting appropriate privileges to the migration user on these databases.

For detailed steps on source configuration, refer link.
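
For orientation, here is a rough sketch of what those three prerequisites typically look like on a Debian/Ubuntu source VM. The user name, database name, password and package version below are illustrative assumptions on my part; the linked documentation remains the authoritative reference:

# Install pglogical (package name varies with the PostgreSQL major version):
sudo apt-get install postgresql-12-pglogical

# In postgresql.conf, enable logical decoding and preload pglogical, then restart PostgreSQL:
#   wal_level = logical
#   shared_preload_libraries = 'pglogical'
#   (also size max_replication_slots, max_wal_senders and max_worker_processes per the DMS guide)

# Create the migration user with the REPLICATION role:
psql -U postgres -c "CREATE USER dbmig WITH REPLICATION ENCRYPTED PASSWORD 'dbmigpwd';"

# On every database to be migrated, install the extension and grant privileges to the migration user:
psql -U postgres -d mydb -c "CREATE EXTENSION IF NOT EXISTS pglogical;"
psql -U postgres -d mydb -c "GRANT USAGE ON SCHEMA pglogical TO dbmig;"
psql -U postgres -d mydb -c "GRANT SELECT ON ALL TABLES IN SCHEMA pglogical TO dbmig;"
psql -U postgres -d mydb -c "GRANT USAGE ON SCHEMA public TO dbmig;"
psql -U postgres -d mydb -c "GRANT SELECT ON ALL TABLES IN SCHEMA public TO dbmig;"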

Next up, enable the “Database Migration Service API” & “Cloud SQL Admin API” in the target project on Google Cloud Platform (this is a must for our cruise to sail). Optionally, the “Service Networking API” may need to be enabled as well, if we need VPC peering between the target Cloud SQL and the VPC where the source database is located on Google Cloud. Enabling this API helps configure the Cloud SQL instance to use a private IP. We will circle back to “Connectivity Methods”, including VPC peering and other options, a little later in the article … so we don’t have to fret about it for now!
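
If you prefer the CLI, the same APIs can be enabled from Cloud Shell; a quick sketch (run in the target project, or pass --project explicitly):

gcloud services enable datamigration.googleapis.com sqladmin.googleapis.com

# Only needed if VPC peering will be used for source connectivity:
gcloud services enable servicenetworking.googleapis.com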

Time to execute Database Migration Service API commands. Woo-hoo!!

Key considerations for the purpose of this post:

  • We will execute the API commands from Cloud Shell with cURL.
  • For authentication, we are using Application Default Credentials without saving them (gcloud auth application-default print-access-token). Feel free to use API keys or other preferred methods of authentication for invoking REST APIs.
  • For PUT/POST requests, we have to create a JSON file that will contain the data needed for API execution (a data file is a safer way to supply sensitive data than typing it on the command line).
  • The path to the data file (wherever applicable) will be provided via the command line, preceded by the @ character, as the --data parameter.
  • Since the APIs are asynchronous, any object creation command will return immediately, even if the operation is not complete. This also applies to other long running operations (LRO) such as verification of a migration job. In such cases, use the DMS API to manage operations and validate the completion of the operation before moving to the next dependent step.

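As a quick sanity check before firing any API calls, you can confirm that Application Default Credentials are in place (assuming `gcloud auth application-default login` has been run once in Cloud Shell); this one-liner is just a convenience, not part of the migration itself:

# Prints the success message only if a token can be minted:
gcloud auth application-default print-access-token > /dev/null && echo "ADC token OK"
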
All set … here we go!

Quick summary of the major steps involved:

  1. Creation of source connection profile for PostgreSQL
  2. [Optional: Applicable for all LRO] Verification of the state of operation
  3. Creation of destination connection profile for Cloud SQL (Postgres)
  4. [Optional: Applicable only if VPC peering is needed] Source Instance & VPC configuration for peering
  5. Creation of migration job
  6. [Optional: Recommended] Verification of the migration job
  7. Starting the migration job
  8. [Manual/Automated] Verify the target Cloud SQL read replica for migration
  9. Promoting the migration job

1. Creation of source connection profile for PostgreSQL

Execute the following steps in sequence, to create a source connection profile.

A. Create a data file with .json extension and upload it to cloud shell: For creating a source connection profile, below is the minimum information required in the data file, without SSL (not recommended for production, though):

{
  "name": "<name_for_the_sourceprofile>",
  "displayName": "<display_name_for_the_profile>",
  "postgresql": {
    "host": "<host_machine_IP>",
    "port": 5432,
    "username": "<migration_username>",
    "password": "<password_of_migration_user>"
  }
}

*Replace the placeholder text in angle brackets appropriately. Also, note the default port used for PostgreSQL: 5432. Sample json below:

{
  "name": "postgres-profile",
  "displayName": "postgres profile",
  "postgresql": {
    "host": "0.1.2.3",
    "port": 5432,
    "username": "dbmig",
    "password": "dbmigpwd"
  }
}

B. Execute the curl command from cloud shell: Once the json created in step #A is uploaded to cloud shell, execute the below command to create a source connection profile in a desired GCP project and region:

curl --header "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
--header 'Content-Type: application/json' \
--data @<path-to-fileuploaded_in_stepA>/<filename> \
-X POST \
https://datamigration.googleapis.com/v1/projects/<target_gcp_project>/locations/<target_gcp_region>/connectionProfiles?connectionProfileId=<source_connection_profile_name_from_stepA>

*Replace the placeholder text in angle brackets appropriately. Sample command below:

curl --header "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
--header 'Content-Type: application/json' \
--data @./create_profile_postgres.json \
-X POST \
https://datamigration.googleapis.com/v1/projects/mydmsproject-sn/locations/us-central1/connectionProfiles?connectionProfileId=postgres-profile

C. Review the response of the command from cloud shell: The response will return an operationId (as this is a long running asynchronous operation), which can be used to validate the state of the create operation above, before creating the destination connection profile.

Response from cloud shell

D. Validate profile creation from GCP Console: Provided the operation completed successfully, creation of the source profile can be verified from GCP console as well (Another way to verify via manage operations is discussed in Step #2, which applies to any REST API that returns an operationID).

Source connection profile

2. [Optional: Applicable for all LRO] Verification of the state of operation

Execute the following steps in sequence, to fetch information about an operationId using GET method.

Note: There are other methods to LIST/CANCEL/DELETE an operation as well, not covered in the context of our discussion here.

A. Execute the curl command from cloud shell: Execute the below command to validate the state of the given operationId:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
https://datamigration.googleapis.com/v1/projects/<same_as_target_gcp_project>/locations/<same_as_target_gcp_region>/operations/<operationId_for_any_LRO>

*Replace the placeholder text in angle brackets appropriately. Sample command below:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
https://datamigration.googleapis.com/v1/projects/mydmsproject-sn/locations/us-central1/operations/operation-1663145358291-5e89f344a7205-94235201-f4fed53c

B. Review the response of the command from cloud shell while the operation is in progress: If the operation is in progress, the response will return the details with field “done”: false. Key fields to note in the response metadata section are “verb” (name of the operation — verify, create, promote etc.) and “target” (resource identifier).

Response from cloud shell

C. Review the response of the command from cloud shell on operation completion: The response will return the details of the completed operation with field “done”: true and any one of the appropriate result field: “error” (in case of failed execution) or “response” (in case of successful execution). Key fields to note in the response metadata section are “verb” (name of the operation — verify, create, promote etc.) and “target” (resource identifier).

Response from cloud shell for a failed operation(migration job verification)
Response from cloud shell for a successful operation(connectionProfile creation)
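
To avoid checking manually, a small polling loop can wrap this GET call. The sketch below is my own convenience (not part of DMS) and relies on the default pretty-printed JSON response; replace the operationId with your own:

OPERATION="operation-1663145358291-5e89f344a7205-94235201-f4fed53c"
URL="https://datamigration.googleapis.com/v1/projects/mydmsproject-sn/locations/us-central1/operations/${OPERATION}"

# Poll every 15 seconds until the operation reports "done": true
until curl -s -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" "$URL" \
  | grep -q '"done": true'; do
  echo "Operation still running ..."
  sleep 15
done
echo "Operation finished - inspect the full response for an \"error\" or \"response\" field."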

3. Creation of destination connection profile for Cloud SQL (Postgres)

The destination profile creates an external primary (identified with the suffix “_master”) along with a read replica (using the database version supplied during creation). The actual data migration happens into the read replica, which is then promoted to primary. The external master is just a placeholder that links to the source database and might show a different database version, which is totally fine!

Execute the following steps in sequence, to create a destination connection profile.

A. Create a data file with .json extension and upload it to cloud shell: For creating a destination connection profile, create a data file with the below information:

{ "name":"<name_for_the_targetprofile>",
"displayName": "<display_name_for_the_profile>",
"cloudsql": {
"settings": {
"databaseVersion": "<target_database_version>",
"tier": "<machine_type>",
"activationPolicy": "<ALWAYS_or_NEVER>",
"autoStorageIncrease": <true_or_false>,
"zone": "<target_gcp_zone_in_project_region>",
"sourceId": "<source_connection_profile_identifier_fromStep1>",
"rootPassword":"<setup_root_password_here>"
}
}
}

*Replace the placeholder text in angle brackets appropriately. Sample json below for an always active connection (set “activationPolicy”: “NEVER” if the instance needs to be OFF irrespective of incoming connection requests):

{ "name":"new2-target-postgres-profile",
"displayName": "new2 target postgres profile",
"cloudsql": {
"settings": {
"databaseVersion": "POSTGRES_12",
"tier": "db-custom-1-3840",
"activationPolicy": "ALWAYS",
"autoStorageIncrease": false,
"zone": "us-central1-b",
"sourceId": "projects/mydmsproject-sn/locations/us-central1/connectionProfiles/postgres-profile",
"rootPassword":"mytargetpwd"
}
}
}

B. Execute the curl command from cloud shell: Once the json created in step #A is uploaded to cloud shell, execute the below command to create a destination connection profile in a desired GCP project and region (same command as Step 1.B):

curl --header "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
--header 'Content-Type: application/json' \
--data @<path-to-fileuploaded_in_StepA>/<filename> \
-X POST \
https://datamigration.googleapis.com/v1/projects/<target_gcp_project>/locations/<target_gcp_region>/connectionProfiles?connectionProfileId=<destination_connection_profile_name_from_stepA>

*Replace the placeholder text in angle brackets appropriately. Sample command below:

curl --header "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
--header 'Content-Type: application/json' \
--data @./create_profile_target_postgres.json \
-X POST \
https://datamigration.googleapis.com/v1/projects/mydmsproject-sn/locations/us-central1/connectionProfiles?connectionProfileId=new2-target-postgres-profile

C. Review the response of the command from cloud shell: Response will return the operationId (similar to Step 1.C), which can be validated for completion before moving to the next step.

D. Validate profile creation from GCP Console: On successful completion, creation of the destination Cloud SQL instances (external primary and a read replica) can be verified from the GCP console as well. However, unlike the source profile, the destination connection profile does not show up under connection profiles on the GCP Console; the GET/LIST methods for connectionProfiles can be used to retrieve information about it (sample calls below).

Destination Cloud SQL instance
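
For completeness, here is what the LIST and GET calls mentioned above look like; the output includes both the source and destination profiles created so far:

# List all connection profiles in the project/region:
curl -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-X GET \
https://datamigration.googleapis.com/v1/projects/mydmsproject-sn/locations/us-central1/connectionProfiles

# Fetch a single profile by name:
curl -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-X GET \
https://datamigration.googleapis.com/v1/projects/mydmsproject-sn/locations/us-central1/connectionProfiles/new2-target-postgres-profile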

We are one-third through the process …

Keep Going!

4. [Optional: Applicable only if VPC peering is needed] Source Instance & VPC configuration for peering

To ensure successful connection between source and target database instances, the following configuration changes must be applied before creating a migration job:

  1. Configure private service access: Assuming the “Service Networking API” is already enabled, navigate to the source VPC on the VPC network page and edit it to a) allocate an IP range and b) create a private connection to the allocated range. If this step is omitted, the migration job won’t get created.
  2. Configure a firewall rule on the source VPC to allow connections from the destination Cloud SQL: Add a firewall rule to the source VPC to allow the outgoing IP address of the destination Cloud SQL instance on port 5432 (the port configured during source profile creation in Step 1).
  3. Edit the source instance configuration file (pg_hba.conf): SSH to the GCE VM where the source PostgreSQL is installed and edit the pg_hba.conf file to allow the outgoing IP address of the destination Cloud SQL instance.

Note: If steps #2 and #3 above aren’t performed, the migration job will still get created; however, it will fail at the verify/start stage with the error: ”Failure connecting to the source database. Make sure the connectivity information on the connection profile is correct and the source database is reachable. Address the issues then Start the migration job.”
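
For reference, the console steps above roughly translate to the following gcloud commands. The range and rule names are illustrative assumptions; adjust the network name and IP to your environment:

# 1. Private services access: allocate a range and peer it with servicenetworking
gcloud compute addresses create google-managed-services-range \
  --global --purpose=VPC_PEERING --prefix-length=16 \
  --network=mydmsproject-sn-vpc-source

gcloud services vpc-peerings connect \
  --service=servicenetworking.googleapis.com \
  --ranges=google-managed-services-range \
  --network=mydmsproject-sn-vpc-source

# 2. Allow the destination Cloud SQL outgoing IP to reach PostgreSQL on port 5432
gcloud compute firewall-rules create allow-cloudsql-to-postgres \
  --network=mydmsproject-sn-vpc-source \
  --direction=INGRESS --action=ALLOW --rules=tcp:5432 \
  --source-ranges=<outgoing_IP_of_destination_CloudSQL>/32

# 3. In pg_hba.conf on the source VM, add a line such as
#      host  all  dbmig  <outgoing_IP_of_destination_CloudSQL>/32  md5
#    and reload PostgreSQL.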

5. Creation of migration job

Let us quickly dive into the possible methods for setting up connectivity between source and target database instances.

Connectivity methods

While creating a migration job, the following three connectivity methods are supported to establish a connection between the source and destination databases:

1. VpcPeeringConnectivity: This option requires details of the VPC where the source database is located in Google Cloud. This is used to set up a VPC peering connection between Cloud SQL and the supplied VPC.

2. ReverseSshConnectivity*: This option is used to create a reverse SSH tunnel between the source and destination databases. This approach depends on a bastion server for setting up the reverse SSH tunnel, which also eventually sets up VPC peering between the Cloud SQL private network and the VPC.

3. StaticIpConnectivity**: This option is used when the source database is expected to allow incoming connections from the destination database’s public IP.

*Additional details on creating a migration job with this option are not covered in this article.

**This is the default and no additional configuration is required.

Now that we understand the connectivity methods, let’s execute the following steps in sequence to create a migration job with VpcPeeringConnectivity.

A. Create a data file with .json extension and upload it to cloud shell: For creating a migration job, below is the minimum information required in the data file for VpcPeeringConnectivity (for StaticIpConnectivity, simply omit the “vpcPeeringConnectivity” field):

Note: “type”: “ONE_TIME” is not supported for PostgreSQL at the time of writing.

{
  "name": "<name_for_the_migrationjob>",
  "displayName": "<display_name_for_the_job>",
  "type": "CONTINUOUS",
  "source": "<source_connection_profile_identifier_fromstep1>",
  "destination": "<destination_connection_profile_identifier_fromstep3>",
  "sourceDatabase": {
    "provider": "DATABASE_PROVIDER_UNSPECIFIED",
    "engine": "POSTGRESQL"
  },
  "destinationDatabase": {
    "provider": "CLOUDSQL",
    "engine": "POSTGRESQL"
  },
  "vpcPeeringConnectivity": {
    "vpc": "<name_of_vpc_containing_source_database_instance>"
  }
}

*Replace the placeholder text in angle brackets appropriately. Sample json below:

{
  "name": "postgres-migration-job",
  "displayName": "postgres migration job",
  "type": "CONTINUOUS",
  "source": "projects/mydmsproject-sn/locations/us-central1/connectionProfiles/postgres-profile",
  "destination": "projects/mydmsproject-sn/locations/us-central1/connectionProfiles/new2-target-postgres-profile",
  "sourceDatabase": {
    "provider": "DATABASE_PROVIDER_UNSPECIFIED",
    "engine": "POSTGRESQL"
  },
  "destinationDatabase": {
    "provider": "CLOUDSQL",
    "engine": "POSTGRESQL"
  },
  "vpcPeeringConnectivity": {
    "vpc": "mydmsproject-sn-vpc-source"
  }
}

B. Execute the curl command from cloud shell: Once the json created in step #A is uploaded to cloud shell, execute the below command to create a migration job in a desired GCP project and region:

curl --header "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
--header 'Content-Type: application/json' \
--data @./<path-to-fileuploaded_in_StepA>/<filename> \
-X POST \
"https://datamigration.googleapis.com/v1/projects/<same_as_target_gcp_project>/locations/<same_as_target_gcp_region>/migrationJobs?migrationJobId=<id_of_the_migration_job>&requestId=<unique_id_for_request>"

*Replace the placeholder text in angle brackets appropriately (the URL is quoted because it contains an “&”). Sample command below:

curl --header "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
--header 'Content-Type: application/json' \
--data @./create_migjob_postgres.json \
-X POST \
"https://datamigration.googleapis.com/v1/projects/mydmsproject-sn/locations/us-central1/migrationJobs?migrationJobId=postgres-migjob&requestId=mypostgresjob1"

C. Review the response of the command from cloud shell: The response will return the operationId (similar to Step 1.C), except that the target field will be the path of the migration job.

D. Validate job creation from GCP Console: Creation of the migration job can be verified from GCP console too.

Note: If “Service Networking API” is not enabled at this stage (required for VPC peering), job creation will fail with error: “Failed to create subnetwork. Consumer <projectId> should enable service:Service Networking before generate default identity”. In case API is enabled but private service access is not configured, job can fail with error: “Failed to create subnetwork. Please create Service Networking connection with service ‘servicenetworking.googleapis.com’ from consumer <projectId> network <source_vpc_network> again”.

Quick view of the Migration job
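
Apart from the console, the newly created job can also be inspected with the GET method from Cloud Shell; the response carries fields such as “state” along with the settings supplied in the data file:

curl -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-X GET \
https://datamigration.googleapis.com/v1/projects/mydmsproject-sn/locations/us-central1/migrationJobs/postgres-migjob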

6. [Optional: Recommended] Verification of the migration job

Execute the following steps in sequence, to verify a migration job. This step does all the pre-validation to ensure that the migration job is set up correctly and will start and run successfully. It is good practice to verify the migration job before starting it.

A. Execute the curl command from cloud shell: Once the migration job is created successfully, execute the below command to verify the migration job:

curl --header "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-X POST \
https://datamigration.googleapis.com/v1/projects/<same_as_target_gcp_project>/locations/<same_as_target_gcp_region>/migrationJobs/<id_of_the_migration_job_created_in_step5>:verify

*Replace the placeholder text in angle brackets appropriately. Sample command below:

curl --header "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-X POST https://datamigration.googleapis.com/v1/projects/mydmsproject-sn/locations/us-central1/migrationJobs/postgres-migjob:verify

B. Review the response of the command from cloud shell: Response will be similar to either Step 2.B or Step 2.C, depending on whether the operation is in progress or has completed. Steps #7 through #9 should not be executed, unless the response returns a json with field “done”: true, along with the presence of the result field named “response” — indicating successful execution.

C. Validate job verification status from GCP Console: Once the long-running operation completes, the verification status of the migration job can be checked from the GCP console as well; if the verification surfaced any errors, they will show up on the console.

Sample view of a failed verification for a migration job
Sample view of a successful verification for a migration job

7. Starting the migration job

At the start of the migration job, an initial snapshot of the source database is taken and the dump file is loaded into the destination. After the initial load, change data capture (CDC) is processed continuously until a decision is made to switch the ongoing traffic to the target and promote the read replica to primary, which we will cover in Step #9.

Now that we have verified our migration job to be free of any errors, let us go ahead and execute the following steps to start a migration job.

Note: If the migration job is not set up correctly (this can be due to a missing extension at the source, the appropriate APIs not being enabled, etc.), starting the migration job can lead to errors. Hence, it is advisable to verify a migration job before starting it.

A. Execute the curl command from cloud shell: Once the migration job is verified successfully, execute the below command to start it (syntax similar to the verify command in Step #6):

curl --header "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-X POST https://datamigration.googleapis.com/v1/projects/mydmsproject-sn/locations/us-central1/migrationJobs/postgres-migjob:start

B. Review the response of the command from cloud shell: Response will be similar to either Step 2.B or Step 2.C, depending on whether the operation is in progress or has completed.

C. Validate job execution status from GCP Console: Provided the job got started without any errors, status of the migration job can be verified from GCP console as well.

Sample view of a CONTINUOUS migration job that is getting started
Sample view of a CONTINUOUS migration job that is running successfully
Sample view of a CONTINUOUS migration job that failed (This is where verification step comes in handy)
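
As a lightweight alternative to the console, the GET call from Step #5 can be reused to keep an eye on the running job. The grep filter below is just a convenience to surface the “state” (e.g. RUNNING) and “phase” fields (the phase typically moves from the initial dump to CDC):

curl -s -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
https://datamigration.googleapis.com/v1/projects/mydmsproject-sn/locations/us-central1/migrationJobs/postgres-migjob \
| grep -E '"(state|phase)"'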

8. [Manual/Automated] Verify the target Cloud SQL read replica for migration

It is advised to verify all migrated databases to ensure the migration was successful, before promoting the target Cloud SQL read replica to primary.

Below are the steps that were performed as part of this post to verify the target Cloud SQL read replica for its promotion readiness.

A. Verify the target databases were created, via 1. the GCP Console or 2. Cloud Shell using cURL commands:

1. Navigate to the SQL instances page on the GCP Console and open the target read replica

A. Cloud SQL target databases

2. Execute the Cloud SQL API command via curl from cloud shell and validate the response

curl -X GET \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
https://sqladmin.googleapis.com/v1/projects/<same_as_target_gcp_project>/instances/<name_for_the_targetprofile_created_in_step3>/databases

*Replace the placeholder text in angle brackets appropriately. Sample command below:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
https://sqladmin.googleapis.com/v1/projects/mydmsproject-sn/instances/new2-target-postgres-profile/databases

Response from cloud shell

B. Login to the target databases and verify the data: Verify that the correct information is migrated across all the target databases. The data across tables should match that in the source tables. The screenshot below shows the data in one of the tables (without a primary key) that was migrated accurately, along with a query on information_schema to confirm that the target table was created just as in the source (i.e. without a primary key).

Data in the target database on Cloud SQL
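
The same checks can be scripted from Cloud Shell with psql. A minimal sketch, assuming the replica is reachable from where you run it and using illustrative database/table names; it connects with the postgres user and the rootPassword set in Step #3:

# PGPASSWORD avoids an interactive prompt; <replica_ip> is the IP of the target read replica
PGPASSWORD='mytargetpwd' psql "host=<replica_ip> port=5432 user=postgres dbname=mydb" <<'SQL'
-- Row count should match the source table
SELECT count(*) FROM my_table;
-- Should return no rows if the source table had no primary key (and one row if it did)
SELECT constraint_name, constraint_type
FROM information_schema.table_constraints
WHERE table_name = 'my_table' AND constraint_type = 'PRIMARY KEY';
SQL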

9. Promoting the migration job

Promoting a migration job disconnects the destination Cloud SQL instance from the source and converts the read replica instance to a primary instance (read/write mode). Now that we have verified our migration target to be free of any errors, let us go ahead and execute the following steps to promote the Cloud SQL read replica to primary instance.

A. Execute the curl command from cloud shell: Once the destination is verified for successful migration (and/or a decision is made to switch traffic to the destination Cloud SQL instance), execute the below sample command to promote the read replica to primary (syntax similar to verify command in Step #6):

curl --header "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-X POST https://datamigration.googleapis.com/v1/projects/mydmsproject-sn/locations/us-central1/migrationJobs/postgres-migjob:promote

B. Review the response of the command from cloud shell: Response will be similar to either Step 2.B or Step 2.C, depending on whether the operation is in progress or has completed.

C. Validate job execution status from GCP Console: While the promotion is in progress (downtime), the status of the job will be “Promote in progress”. Once the migration job completes processing the outstanding changes without any errors, its status changes to COMPLETED, which can be verified from the GCP console.

D. Validate the status of the Cloud SQL instance from GCP Console: On the GCP console, verify that the link between the external master and the primary Cloud SQL instance (former read replica) no longer exists.
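
This can also be double-checked via the Cloud SQL Admin API: after promotion the instance should no longer report itself as a replica (the grep filter is just a convenience):

curl -s -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-X GET \
https://sqladmin.googleapis.com/v1/projects/mydmsproject-sn/instances/new2-target-postgres-profile \
| grep -E '"(instanceType|masterInstanceName)"'

# Expect "instanceType": "CLOUD_SQL_INSTANCE" and no "masterInstanceName" entry after promotion.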

Bonus section with some good-to-know commands and reference links that can come in handy anytime!

Reference Links

  1. PostgreSQL migration with DMS
  2. QuickStart of postgres database migration
  3. Database migration service REST APIs
  4. Debug issues with migration jobs
  5. SSL/TLS certificate setup (not used as part of this post)
  6. Set up authentication for invoking REST APIs

Should you be interested in learning more, you can find another one of my Medium posts on the topic “GCP’s DMS REST API using Python” here 😎.

Happy Learning!
