ODBC, Private Service Connect and Proxies

Neil Kolban
Google Cloud - Community
5 min read · Oct 22, 2021

Recently I was presented with a puzzle by a client. The client has an on-premises network that is connected to GCP as shown in the following diagram.

Laptops on the client's network are LAN connected and, when they want to reach the Internet, go through an outbound proxy. The on-premises network is also connected by a VPN to a GCP VPC which has a Private Service Connect (PSC) instance configured, providing a route to BigQuery that does not involve routing over the Internet. The client is also using VPC Service Controls to prevent any access to BigQuery from the Internet, as the BigQuery database is hosting extremely sensitive data.

On the laptop, the client is using a SQL application that accesses BigQuery using the Simba ODBC driver. When the client ran their application, it failed to connect to BigQuery. When we examined the puzzle, we found that the Simba ODBC driver is hard-coded to access BigQuery by sending HTTPS requests to bigquery.googleapis.com. Since this domain name resolves to a public IP address, the laptop was sending requests outbound through the on-premises proxy to the Internet. On receipt, GCP was rejecting the requests since they arrived over the Internet and not internally through the VPC. This constraint was put in place by VPC Service Controls.
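To make the symptom concrete, here is a minimal sketch of how one might check, from any machine, whether a name currently resolves to an RFC 1918 address or to a public one. The function names are illustrative and not part of any Google or Simba tooling.

```python
import ipaddress
import socket

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    """Return True if the given IP address string is inside an RFC 1918 range."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


def resolve_ipv4(hostname):
    """Resolve a hostname to its IPv4 addresses using the local resolver."""
    infos = socket.getaddrinfo(
        hostname, 443, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


def resolves_privately(hostname):
    """Return True only if every resolved address is an RFC 1918 address."""
    addresses = resolve_ipv4(hostname)
    return bool(addresses) and all(is_rfc1918(a) for a in addresses)
```

On the client's laptop, `resolves_privately("bigquery.googleapis.com")` would be expected to return `False`, which is why the traffic headed for the Internet.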

What we want to achieve is that the BigQuery API requests originating from the laptop are instead routed to PSC for processing. We considered a number of approaches.

The first was to edit the hosts file on each laptop so that bigquery.googleapis.com resolves to the RFC 1918 IP address of PSC. This worked, but the effort and exposure of asking users to edit the file was considered too high.

The second possibility was to override the DNS nameserver configuration for the enterprise as a whole to create a mapping from bigquery.googleapis.com to the RFC 1918 address of PSC. Again, this was tested and worked, but there are some unpleasant side effects. The Simba ODBC documentation requires mapping of:

  • bigquery.googleapis.com
  • bigquerystorage.googleapis.com
  • oauth2.googleapis.com
  • www.googleapis.com
  • accounts.google.com

If we change the enterprise-wide DNS mappings, then we are effectively asking all traffic across the whole enterprise to route through PSC for these names. That can have a much broader effect than what we want to achieve just for our own ODBC usage.

After some deeper investigation, an additional approach offered itself. The Simba ODBC driver allows the configuration of an HTTP proxy. What this means is that when the driver wishes to make an API request (e.g. to BigQuery), instead of making that request directly, the driver will send the request to the HTTP proxy and the HTTP proxy will make the request on behalf of the driver.

We can see this pictorially. When a source system makes an HTTP request to a target system, the source normally makes the request directly.

When a proxy is introduced, the picture changes to:

As before, the source thinks it is sending the request to the target, but instead the request is sent to the proxy, which sends the request onwards. The proxy decides where the request is actually routed. This means that requests which the source thinks it is sending to bigquery.googleapis.com can be sent by the proxy to somewhere else. The value of this is that, apart from pointing the source at the proxy, no DNS or hosts file changes are needed at the source.
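A small sketch may help show why the proxy gets to decide the destination. It assumes the driver tunnels its HTTPS traffic through the proxy using the standard HTTP CONNECT method (the article does not capture the wire traffic, so this is an assumption). In that case the client hands the proxy a host name rather than an IP address, and it is the proxy machine that performs the name resolution. The helper name below is hypothetical.

```python
def build_connect_request(host, port=443):
    """Build the request a client sends to an HTTP proxy to open a tunnel.

    Note that the target is given by name; the proxy resolves it.
    """
    target = f"{host}:{port}"
    lines = [
        f"CONNECT {target} HTTP/1.1",
        f"Host: {target}",
        "",  # blank line terminates the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


request = build_connect_request("bigquery.googleapis.com")
print(request.decode("ascii"))
```

Because only the name travels to the proxy, whatever address the proxy machine resolves for that name is where the connection ends up.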

If we add this to our diagram, we now see the introduction of a proxy within GCP. The Simba ODBC driver sends its requests to this proxy, and the proxy knows to route them to PSC.

This feels like an elegant solution. The one downside is that we need to configure and manage the proxy. Let us now walk through a sample configuration to see it all work.

  1. Create a Private Service Connect definition at IP address 10.1.0.1.
  2. Create a Compute Engine VM running Linux that will be our proxy.
  3. Install a proxy server. In our example we will use tinyproxy:
sudo apt-get install tinyproxy

4. Edit the /etc/tinyproxy/tinyproxy.conf file

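Comment out the following line. With no Allow or Deny lines present, tinyproxy accepts connections from any client address, which lets the laptops reach it: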

Allow 127.0.0.1

Uncomment

Filter "/etc/tinyproxy/filter"

Uncomment

FilterDefaultDeny Yes

5. Edit /etc/tinyproxy/filter

Specify the domain names that we wish to allow. All other domain names will be blocked, so the proxy cannot be used to reach any destination other than the ones listed.

^bigquery\.googleapis\.com$
^oauth2\.googleapis\.com$
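As a quick sanity check of the filter, the sketch below evaluates the two patterns against a few host names. It uses Python's `re` module as a stand-in for the POSIX regular expressions that tinyproxy uses. For simple anchored patterns like these the two should behave the same, but that equivalence is an assumption of this sketch.

```python
import re

# The two lines from /etc/tinyproxy/filter shown above.
FILTER_LINES = [
    r"^bigquery\.googleapis\.com$",
    r"^oauth2\.googleapis\.com$",
]
COMPILED = [re.compile(pattern) for pattern in FILTER_LINES]


def is_allowed(host):
    """Return True if the host matches any filter line (default-deny otherwise)."""
    return any(pattern.search(host) for pattern in COMPILED)


for name in [
    "bigquery.googleapis.com",
    "oauth2.googleapis.com",
    "bigquerystorage.googleapis.com",
    "bigquery.googleapis.com.example.org",
]:
    print(name, "allowed" if is_allowed(name) else "denied")
```

Note that the other names from the Simba list earlier, such as bigquerystorage.googleapis.com, would be denied by this filter. If the driver needs them in a given setup, each would need its own filter line.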

6. Edit /etc/hosts and add:

10.1.0.1 bigquery.googleapis.com
10.1.0.1 oauth2.googleapis.com

This will cause the resolution of these domain names to point to the PSC entry at 10.1.0.1.
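To double-check the entries before restarting anything, one could parse the hosts file and confirm which names map to the PSC address. The following is a minimal sketch that handles only the basic hosts format (address, then names, with `#` comments). The first-entry-wins behavior for duplicate names is an assumption here, and the sample text is illustrative.

```python
def parse_hosts(text):
    """Parse hosts-file text into a dict of lowercased name -> address."""
    mapping = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # Keep the first address seen for a name.
            mapping.setdefault(name.lower(), address)
    return mapping


SAMPLE = """\
127.0.0.1 localhost
# Route Google API names to Private Service Connect
10.1.0.1 bigquery.googleapis.com
10.1.0.1 oauth2.googleapis.com
"""

hosts = parse_hosts(SAMPLE)
print(hosts["bigquery.googleapis.com"])
```

On the proxy VM itself, the same function could be fed the contents of /etc/hosts.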

7. Stop tinyproxy so that it can be started again with the new configuration.

sudo service tinyproxy stop

8. Start tinyproxy again so that it begins servicing requests with the new configuration.

sudo service tinyproxy start

9. On the Windows machine, open up the ODBC definition and set the proxy entries:

Specify the IP address of the Compute Engine for the Proxy Host and 8888 for the Proxy Port.

Click the test button and we should see the results:

10. View the content of a table using an ODBC SQL tool such as RazorSQL.
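Independently of the ODBC driver, the proxy path can also be exercised with a few lines of Python from a machine that can reach the proxy. This is a rough sketch: the proxy address 10.128.0.5 used in the usage note is a hypothetical placeholder for the Compute Engine VM's IP, 8888 is the port from step 9, and the function names are illustrative.

```python
import urllib.error
import urllib.request


def proxy_url(proxy_host, proxy_port=8888):
    """Build the URL of the HTTP proxy (tinyproxy in this walkthrough)."""
    return f"http://{proxy_host}:{proxy_port}"


def make_proxy_opener(proxy_host, proxy_port=8888):
    """Create a urllib opener that sends HTTPS requests through the proxy."""
    handler = urllib.request.ProxyHandler(
        {"https": proxy_url(proxy_host, proxy_port)}
    )
    return urllib.request.build_opener(handler)


def status_via_proxy(url, proxy_host, proxy_port=8888, timeout=10.0):
    """Return the HTTP status code received for url when going via the proxy."""
    opener = make_proxy_opener(proxy_host, proxy_port)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # An HTTP error status still proves the request reached a server.
        return err.code
```

Calling `status_via_proxy("https://bigquery.googleapis.com/", "10.128.0.5")` and getting any HTTP status back suggests the tunnel through the proxy was established. A request for a name that is not in the filter would be expected to fail with an exception instead.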

Summary: By following this recipe, we have demonstrated that we can configure the Simba ODBC driver to use an HTTP proxy that we have installed on a Compute Engine VM. Domain name resolution on that VM causes the API calls originating from Simba ODBC to be sent to the Private Service Connect IP address, so they are satisfied without ever going over the Internet and without “patching” the DNS names on the machine hosting the Simba ODBC driver.

Neil Kolban
Google Cloud - Community

IT specialist with 30+ years industry experience. I am also a Google Customer Engineer assisting users to get the most out of Google Cloud Platform.