Working with on-premises databases — Step by Step
You might quickly find yourself working with notebooks or RStudio in IBM Data Science Experience (DSX) and get to a point where you would like to access some databases that reside behind some firewall, for instance in your own company. Here is how you can make that possible using a Db2 Warehouse system deployed on-premises.
You need a mechanism to tunnel a database connection across the firewall. The Bluemix Secure Gateway service provides a very easy solution for that. It allows you to set up a gateway client behind your firewall that maintains the tunneled connection to the service in Bluemix, which in turn exposes it to DSX as if it were a database in the cloud.
Here is the entire end-to-end procedure to set up the secure tunnel to a Db2 Warehouse deployment and use it in DSX. The same procedure can also be used with other on premise database and data warehouse systems such as Db2 LUW or BigSQL.
Sign in to your Bluemix account and navigate to the Catalog. Under Services, select “Integrate” and click on “Secure Gateway”:
Create an instance of the service by clicking the create button:
Click “Add Gateway” and specify a custom name of your gateway:
Click on the new gateway you just created to open its configuration. There select the tab “Destinations” to the bottom left and kick the “+” sign to add a new destination:
Select the “On-Premises” radio button for the location of the system that you want to access, and then click “Next”:
Type in the IP address or server name of your machine running your Db2 Warehouse instance. Also type in port number 50000 (or 50001 if you want to use SSL encrypted port of Db2 Warehouse). Then click “Next”:
Keep the default protocol “TCP” for connection to the destination and click “Next”:
You can keep the authentication setting to “None” (otherwise you would have to deploy encryption keys to the gateway components). Then click “Next”:
Click “Next” again if you don’t want to set up additional IP rules:
Give the destination a meaningful name and click “Finish”:
Your destination is now configured inside the secure gateway service, but it is not yet connected to your on-premises database. To connect, you first need to set up the gateway “client” behind your firewall. To do so select the “Clients” tab at the bottom right and click on “Connect Client”:
Select “Docker” as the method to connect to the gateway. Copy the “docker run …” command and paste it into a terminal session on your system that is behind the firewall and run it. This could be the same machine where your Db2 Warehouse docker container runs or another system in the same network:
You will get a command prompt of the gateway client. There you now need to set up the access control list of endpoints that this client is allowed to expose via the gateway. To do so type “acl allow Db2host:Db2port”:
(Note, when you want to gain access to the command prompt later on again you can simply re-attach to the docker container with “docker attach container_name”. To find out the name of your docker container issue “docker ps”.)
In your Secure Gateway Bluemix service, you now see that the gateway is connected:
Next, open the “Settings” dialog of the destination (the gear-wheal icon) to see its details. Copy the cloud hostname and port:
Now you can login to Data Science Experience and define a connection to your on-premises Db2 Warehouse system. Open one of your projects and click on “add data assets” and then the “Create Connection” button:
Provide a name for your connection, select service category “External”, and paste in the cloud host and port name that you copied from the destination in your secure gateway. Specify database name “BLUDB” and specify your user and password of your Db2 Warehouse system. Then click “Create”:
Now you can use that connection in a notebook. For instance use a sample notebook like this one and use the “Insert to code” function to paste in the connection data for your Db2 Warehouse system: