Cloud SQL with private IP only: the Good, the Bad and the Ugly

Published in Google Cloud - Community · 10 min read · Jan 5, 2021

Data is the new goldmine of every company, and this treasure must be kept secure and protected. That’s why, for many years, a common good practice among database administrators has been to remove all public access to the database, especially the public IP, and to grant access only through the private IP.
This “golden” rule is enforced by all security teams, and they require the same pattern for any cloud deployment.

Cloud SQL service, the managed database service on Google Cloud, allows you to:

Set a private IP on your instance and connect it directly to the VPC of your choice
Remove the public IP from your instance

Thanks to these features, you can enforce the security team requirements.
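
As a rough sketch of the two settings above, the gcloud call below switches an existing instance to private IP only. It assumes private services access is already configured on the VPC; the function name and its arguments are hypothetical placeholders, not something from the Cloud SQL documentation.

```shell
# Switch an existing Cloud SQL instance to private IP only.
# Assumption: private services access is already set up on the VPC.
# Usage: make_private_only <instance> <project-id> <vpc-name>
make_private_only() {
  instance="${1:?instance name required}"
  project="${2:?project id required}"
  vpc="${3:?VPC name required}"
  # --network attaches the private IP to the VPC,
  # --no-assign-ip removes the public IP.
  gcloud sql instances patch "$instance" \
    --project="$project" \
    --network="projects/${project}/global/networks/${vpc}" \
    --no-assign-ip
}
```

For example: make_private_only my-db my-project my-vpc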

But is it a problem, in day-to-day work, not to have a public IP on Cloud SQL instances?

Let’s check this over 3 use cases:

Compute Engine connectivity
Serverless services connectivity
Local environment connectivity

Cloud SQL proxy binary

Before going deeper into the use cases, I would like to take a quick look at the main features of Cloud SQL proxy:

This binary opens a secure, end-to-end encrypted tunnel. In short, even if your database doesn’t have an SSL certificate, the data is encrypted in transit.
Before opening the tunnel, the binary checks against the IAM service API whether the current credential is authorized to access the Cloud SQL instance. This is an additional layer of security, on top of the standard database authentication by user/password.
The tunnel can be opened on a local port (TCP connection mode) or on a Unix socket (not possible in a Windows environment).
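
To make the two connection modes concrete, here is a minimal sketch using the cloud_sql_proxy binary (v1 flags, as used later in this article). The function names, the 3306 default port and the /cloudsql directory are assumptions chosen for illustration.

```shell
# TCP mode: the proxy listens on a local port (3306 by default here).
# Usage: proxy_tcp <project:region:instance> [port]
proxy_tcp() {
  cloud_sql_proxy --instances="${1:?connection name required}=tcp:${2:-3306}"
}

# Unix socket mode: the socket is created under the given directory,
# which must already exist and be writable (not available on Windows).
# Usage: proxy_socket <project:region:instance> [socket-dir]
proxy_socket() {
  cloud_sql_proxy --dir="${2:-/cloudsql}" --instances="${1:?connection name required}"
}
```

In both cases the proxy stays in the foreground and keeps the tunnel open until you stop it.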

The Good: Compute Engine connectivity

Because this “private IP security pattern” was built for legacy architectures (i.e. on-prem VMs and private networks), the constraint perfectly fits the IaaS world (i.e. Compute Engine + VPC).
I mention Compute Engine, but it’s also true for all the services that use Compute Engine instances: Dataflow, Dataproc, GKE…

Deploy your VM in the same VPC as the Cloud SQL instance’s private IP.
Use the Cloud SQL instance private IP to reach it from your app deployed on Compute Engine.
Open the required firewall rules if needed.

That’s “good”!!

Note: whether to use Cloud SQL proxy in this case is open to discussion. It adds a security layer by checking the authorization against the IAM service and by encrypting the communication, but it also adds latency and a potential point of failure. It’s a matter of tradeoffs.

The Bad: Serverless services connectivity

When we use the “new world” of the cloud, the serverless paradigm, things aren’t so nice with Cloud Run, Cloud Functions and App Engine.

Indeed, when you deploy on serverless platforms, by definition, you don’t have to manage the servers, and thus the servers aren’t in your VPC. As a result, it’s not possible, out of the box, to reach the Cloud SQL private IP connected to your project’s VPC.

If you dig into Cloud Run, Cloud Functions and App Engine, you can find how to connect your Cloud SQL instance to your serverless service thanks to a built-in feature:

Add the Cloud SQL instance connection name to your configuration, and a tunnel is automatically created when the instance starts.

In reality, it’s a connection opened by the Cloud SQL proxy binary in Unix socket mode. But this solution works only if the Cloud SQL instance has a public IP.
With a private IP only, and even though a private IP connection mode exists in Cloud SQL proxy, it doesn’t work!

Connect the serverless services to the Cloud SQL private IP

Fortunately, a solution exists. You can click on the private IP tab in the Cloud SQL connection tutorials for this (example for Cloud Functions).

In summary, you need to:

Create a serverless VPC connector in the same region as your serverless service and, of course, connected to the same VPC as your Cloud SQL instance.
Note that today, all the regions are supported by serverless VPC connector, but that wasn’t the case for a while.
Deploy your serverless service with this VPC connector, which is supported by Cloud Run, Cloud Functions and App Engine.
In your app, use the Cloud SQL private IP (instead of the Unix socket connection).

This solution works but introduces a new network element, an additional cost and a possible new point of failure in the connectivity chain.
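
The first two steps could look like the sketch below, shown for Cloud Run. The function names are hypothetical, and the default 10.8.0.0/28 range is only an assumption: pick any unused /28 in your VPC.

```shell
# Create a serverless VPC connector in the region of the serverless service.
# Usage: create_connector <name> <region> <vpc-name> [unused /28 range]
create_connector() {
  gcloud compute networks vpc-access connectors create "${1:?connector name required}" \
    --region="${2:?region required}" \
    --network="${3:?VPC name required}" \
    --range="${4:-10.8.0.0/28}"
}

# Deploy a Cloud Run service that routes private traffic through the connector.
# Usage: deploy_with_connector <service> <image> <region> <connector>
deploy_with_connector() {
  gcloud run deploy "${1:?service name required}" \
    --image="${2:?container image required}" \
    --region="${3:?region required}" \
    --vpc-connector="${4:?connector name required}"
}
```

Cloud Functions accepts a similar --vpc-connector flag on gcloud functions deploy.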

That’s “bad”!!

The Ugly: Local environment connectivity

When you want to query your database, sometimes in production, through your preferred database IDE installed on your computer, you need to connect your local environment to the Cloud SQL instance, and thus via the private IP.

Because the Cloud SQL instance doesn’t have a public IP, you can’t reach it directly from the internet (your computer), and neither can Cloud SQL proxy. The solution here is to create a bastion host: a bridge VM between the outside world (public IP) and the inside world (private IP).

For this, create a small Compute Engine instance, an f1-micro for example (very affordable, less than $5 per month).

Security team requirement: no VM may have a public IP. Of course, 0.0.0.0/0 firewall rules aren’t allowed, especially on port 22 (ssh).
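
A bastion host that complies with this requirement could be created as in the following sketch. The function name and arguments are hypothetical; the subnet is assumed to belong to the VPC peered with your Cloud SQL instance.

```shell
# Create a small bastion VM without any public IP.
# Usage: create_bastion <instance-name> <zone> <subnet>
create_bastion() {
  # --no-address prevents the allocation of an external (public) IP.
  gcloud compute instances create "${1:?instance name required}" \
    --zone="${2:?zone required}" \
    --subnet="${3:?subnet required}" \
    --machine-type=f1-micro \
    --no-address
}
```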

Bastion host without public IP

New constraints to deal with! But it’s consistent: if you close the door, it’s not to leave the window open!!
No problem: Google Cloud offers a great and easy solution for this, Identity-Aware Proxy (IAP).

With IAP, you only have to allow the Google Cloud IAP IP range 35.235.240.0/20 to access your Compute Engine instance on port 22 in your firewall rules. Thus, there is no need to open 0.0.0.0/0 (the whole internet) to reach the Compute Engine ssh port!
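
Such a firewall rule might be created like this. The rule name and the function name are assumptions; the IP range and the port are the ones mentioned above.

```shell
# Allow SSH (tcp:22) only from the IAP range, not from the whole internet.
# Usage: allow_iap_ssh <vpc-name> [rule-name]
allow_iap_ssh() {
  gcloud compute firewall-rules create "${2:-allow-ssh-from-iap}" \
    --network="${1:?VPC name required}" \
    --direction=INGRESS \
    --allow=tcp:22 \
    --source-ranges=35.235.240.0/20
}
```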

Then, use the gcloud SDK to connect to your bastion host Compute Engine instance
gcloud compute ssh <Instance Name> --zone=<Your Zone>
And the magic happens! The gcloud SDK automatically detects that the Compute Engine instance has no public IP and automatically opens an IAP tunnel to connect to the instance over SSH. It’s invisible to the user!

The credential (here the user account, but it could be a service account) needs enough permissions to create an IAP tunnel. The roles are roles/iap.tunnelResourceAccessor to create the tunnel, and roles/compute.instanceAdmin.v1 to access the VM.
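
As an illustration, both roles can be granted at project level with the sketch below. The function name is hypothetical, and granting at project level is a simplifying assumption: you may prefer a narrower scope.

```shell
# Grant the two roles needed to SSH into the bastion host through IAP.
# Usage: grant_bastion_access <project-id> <member>
#   member example: user:alice@example.com
grant_bastion_access() {
  project="${1:?project id required}"
  member="${2:?member required}"
  for role in roles/iap.tunnelResourceAccessor roles/compute.instanceAdmin.v1; do
    gcloud projects add-iam-policy-binding "$project" \
      --member="$member" --role="$role"
  done
}
```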

Now, you are connected to the bastion host, but you aren’t yet connected to the Cloud SQL instance…

IAP tunneling, SSH port forwarding and Cloud SQL proxy

And here the “ugly” part starts. We have 2 things to achieve:

Connect the bastion host to the Cloud SQL instance via the private IP
Forward the local environment database connection request to the bastion host to reach the Cloud SQL instance through Cloud SQL Proxy tunnel

1. Cloud SQL instance connectivity from bastion host
For this, we need the Cloud SQL proxy to open a tunnel between the bastion host and the Cloud SQL instance.

First issue: because the bastion host doesn’t have a public IP, it can’t reach the internet to download the binary!
You can choose to create a Cloud NAT on your bastion host Compute Engine instance IP range. But the easiest way is to download the Cloud SQL proxy binary in your local environment and then copy it to the bastion host. To achieve this, you can use this command:
gcloud compute scp /local/path/to/cloud_sql_proxy <instanceName>:/tmp
Because you can open an SSH connection through IAP, you can also use the scp protocol (ssh copy) to copy files through IAP! Magic!!
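
If you prefer the Cloud NAT option, a minimal sketch could be the following. The router and NAT names are hypothetical, and covering all the subnets of the region is an assumption made for brevity.

```shell
# Create a Cloud Router and a Cloud NAT so that VMs without a public IP
# can reach the internet for outbound traffic.
# Usage: create_nat <vpc-name> <region>
create_nat() {
  network="${1:?VPC name required}"
  region="${2:?region required}"
  gcloud compute routers create "nat-router-${region}" \
    --network="$network" --region="$region" &&
  gcloud compute routers nats create "nat-config-${region}" \
    --router="nat-router-${region}" --region="$region" \
    --auto-allocate-nat-external-ips \
    --nat-all-subnet-ip-ranges
}
```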

Great, now you have the binary on the bastion host Compute Engine instance, and you want to test whether the connection works. You can run these commands:

#Connect to bastion host
gcloud compute ssh <instance Name> --zone=<Your zone>
#Change the permission of the Cloud SQL proxy binary (do it only once)
chmod +x /tmp/cloud_sql_proxy
#Connect to your Cloud SQL instance
/tmp/cloud_sql_proxy --instances=<connection_name>=tcp:3306

You can find the connection_name on the page of your Cloud SQL instance.

Second issue: it doesn’t work! And there are several possible reasons!

If you created your bastion host Compute Engine instance with the default parameters, especially the API scopes, you don’t have enough scope to reach the Cloud SQL APIs.
To solve this, you need to stop your Compute Engine instance, edit it and customize the scopes.

When the Cloud SQL tunnel is created, the IAM permissions of the current credentials (here the Compute Engine credentials) are checked. The Compute Engine service account might not have enough permissions.
To solve this, make sure that at least one of these roles is granted to the service account: Cloud SQL Client, Cloud SQL Editor or Cloud SQL Admin.
The Compute Engine default service account has the roles/editor role: too broad, but enough for our use case.
Finally, by default, when you call a Google Cloud API, the public DNS is used. This happens when Cloud SQL proxy calls the Cloud SQL API or the IAM service while the tunnel is created. And, because the bastion host Compute Engine instance doesn’t have a public IP, it can’t reach the internet and the googleapis.com public DNS.
To solve this, you have 2 solutions. Either set up a Cloud NAT as proposed before, or use a handy feature of Google Cloud VPC: allow the bastion host’s current subnet to call the private googleapis.com DNS. To achieve this, go to your VPC, select the correct subnet and edit it. Then select On for the Private Google Access radio button, and save.
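
The three fixes above can also be applied from the command line. This is only a sketch: the function names are hypothetical, and the cloud-platform scope is a deliberately broad assumption that you may want to narrow.

```shell
# 1. Scopes: the VM must be stopped before its scopes can be changed.
# Usage: fix_scopes <instance-name> <zone>
fix_scopes() {
  instance="${1:?instance name required}"
  zone="${2:?zone required}"
  gcloud compute instances stop "$instance" --zone="$zone" &&
  gcloud compute instances set-service-account "$instance" --zone="$zone" \
    --scopes=cloud-platform &&
  gcloud compute instances start "$instance" --zone="$zone"
}

# 2. IAM: grant the minimal Cloud SQL role to the VM's service account.
# Usage: grant_sql_client <project-id> <service-account-email>
grant_sql_client() {
  gcloud projects add-iam-policy-binding "${1:?project id required}" \
    --member="serviceAccount:${2:?service account email required}" \
    --role=roles/cloudsql.client
}

# 3. Network: enable Private Google Access on the bastion host's subnet.
# Usage: enable_private_google_access <subnet> <region>
enable_private_google_access() {
  gcloud compute networks subnets update "${1:?subnet required}" \
    --region="${2:?region required}" \
    --enable-private-ip-google-access
}
```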

Great!! Now the bastion host Compute Engine can use Cloud SQL proxy to open a tunnel from the local port 3306 to the Cloud SQL instance, through the private IP.

2. Forward local environment traffic to Cloud SQL instance
To achieve this, we won’t use the gcloud ssh built-in feature but an alternative solution. In addition to a direct ssh connection to Compute Engine instances, the gcloud SDK lets you create a tunnel on any port.

So, let’s create a tunnel on port 22 of the bastion host Compute Engine instance and define an arbitrary local port (here 4226):

gcloud compute start-iap-tunnel <instance Name> 22 \
  --zone=<Your zone> --local-host-port=localhost:4226

Great, a tunnel is open and we can use it to connect to the bastion host Compute Engine instance over ssh.

Leave the tunnel open and running in this terminal, and open a new one.

Now, let’s connect to it over ssh. In one command, you will achieve several things that are mandatory to establish the connection:

Create a port forwarding from your local environment to the bastion host Compute Engine instance
-L 3306:localhost:3306
“my local port 3306 is forwarded to the target VM (here the bastion host) to reach port 3306 opened on its localhost (i.e. the target VM)”
Reuse the ssh key created automatically by gcloud during the first ssh (or scp) connection to the bastion host Compute Engine instance. This private key is stored in the ~/.ssh directory of the current user
-i ~/.ssh/google_compute_engine
Open an ssh connection through the existing IAP tunnel, which listens on the local port 4226 and forwards the ssh traffic
-p 4226 localhost
Once connected to the bastion host Compute Engine instance, run Cloud SQL proxy to create the tunnel to the Cloud SQL instance on port 3306. For this, put the command to run on the remote (the bastion host) after a --
-- /tmp/cloud_sql_proxy --instances=<connection_name>=tcp:3306

And put all of this together

ssh -L 3306:localhost:3306 \
  -i ~/.ssh/google_compute_engine \
  -p 4226 localhost \
  -- /tmp/cloud_sql_proxy --instances=<connection_name>=tcp:3306

Now, you have it! Use your favorite database IDE, connect it to localhost:3306 and sign in to your database with the user/password.
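
To check the whole chain quickly from a terminal, a sketch with the mysql command-line client follows. It assumes a MySQL instance (port 3306 suggests it) and a hypothetical function name. Note the use of 127.0.0.1: on Unix, the mysql client interprets localhost as a local Unix socket instead of the forwarded TCP port.

```shell
# Connect to the database through the forwarded local port.
# Usage: connect_db <database-user> [local-port]
connect_db() {
  # --password without a value prompts for the password interactively.
  mysql --host=127.0.0.1 --port="${2:-3306}" \
    --user="${1:?database user required}" --password
}
```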

Wow!! All of this to be able to comply with a security pattern! Here is a diagram of what we have built

That’s “ugly”!!

Additional side effect

Using Cloud SQL with a private IP only adds caveats. Indeed, a peering is created between the VPC of the project and the Cloud SQL network (managed by Google Cloud).
And peering comes with 2 limitations:

This last point, the lack of transitivity, is very important and can be a blocker if you want to reach the database instance from another project. Indeed, from the VPC of another project, you would like to create a peering with the VPC of the project that hosts the Cloud SQL instance, to reach it through the private IP.

But, because of the transitivity limitation, you can’t: the Cloud SQL private IP isn’t visible from the VPC of the other project.
The workaround here is to create a VPN to connect the 2 VPCs.
That’s, uh… “ugliest”??

An additional side effect is the inability of Apps Script apps to use Cloud SQL instances if no public IP is defined.

“Smart” security pattern matters

Security matters, and the existing patterns for databases work great… in the legacy world.
Now, with Cloud SQL and Cloud SQL proxy, additional security layers exist and make the old patterns obsolete.

What is the problem with having a public IP if no IP ranges are allowed to connect to it?

It’s the definition of a firewall rule in the legacy world, isn’t it? Deny all IPs access to the range/IP where my databases are hosted.

Security team concern: How to be sure that no IP range is allowed?

This question is legitimate, and that’s why you can enforce an organisation policy (Restrict Authorized Networks on Cloud SQL instances) to prevent the addition of public IP ranges on Cloud SQL instances, company-wide.
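
Enforcing this policy on a whole organisation might look like the sketch below. The function name is hypothetical, and I assume that the constraint is the boolean one named sql.restrictAuthorizedNetworks and that your account is allowed to administer organisation policies.

```shell
# Enforce the "Restrict Authorized Networks on Cloud SQL instances"
# constraint at organisation level.
# Usage: restrict_authorized_networks <organization-id>
restrict_authorized_networks() {
  gcloud resource-manager org-policies enable-enforce \
    sql.restrictAuthorizedNetworks \
    --organization="${1:?organization id required}"
}
```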

In the end, allowing a public IP on Cloud SQL instances avoids a lot of workarounds and strange designs, without decreasing the security level.
Moreover, the tunnel created by Cloud SQL proxy is encrypted and ensures a high level of security and confidentiality.

The cloud changes the paradigms (see BeyondCorp) and the security patterns need to be updated to comply with them.
Smart security patterns are better than legacy security patterns!