Move your analytics, not your data!

GoodData Developers
3 min read · Feb 25, 2021

Containers open a whole new set of options for data analytics. Deploying and configuring a complete analytical stack can take less time than executing a query in your database.

GoodData has launched the modular analytical stack GoodData.CN that can be deployed as a single Docker image or as an elastic k8s application. The deployment and configuration of this stack can be easily automated. Let me share a quick example of how this stack works.

1. Pull Docker image

docker pull gooddata/gooddata-cn-ce

I tried this from my local Mac, a Google Cloud VM, and an AWS EC2 instance. In all cases, the pull completed in roughly 20 seconds (depending on network speed).

2. Run the image

docker run -t -i -p 3000:3000 gooddata/gooddata-cn-ce

After the container starts, you can access it at http://localhost:3000

3. Connect to the database

curl http://localhost:3000/api/data-sources \
-H "Content-Type: application/vnd.gooddata.api+json" \
-H "Accept: application/vnd.gooddata.api+json" \
-H "Authorization: Bearer YWRtaW46Ym9vdHN0cmFwOmFkbWluMTIz" \
-X POST \
-d '{
  "data": {
    "attributes": {
      "name": "demo-ds",
      "url": "jdbc:postgresql://localhost:5432/demo",
      "schema": "demo",
      "type": "POSTGRESQL",
      "username": "demouser",
      "password": "demopass",
      "enableCaching": true
    },
    "id": "demo-ds",
    "type": "data-source"
  }
}'

As you can see, this is a simple JDBC connection to my local Postgres database.
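A quick note on the Authorization header used in these calls: in the Community Edition, the bearer token is simply the base64 encoding of the image's default demo credentials (an assumption worth verifying against the docs, and definitely not a pattern for production). A short Python sketch shows how it is built:

```python
import base64

# Default demo credentials shipped with the Community Edition image
# (assumption: these are the documented bootstrap defaults).
raw = "admin:bootstrap:admin123"

# The bearer token in the curl examples is just this string, base64-encoded.
token = base64.b64encode(raw.encode("ascii")).decode("ascii")
print(token)  # YWRtaW46Ym9vdHN0cmFwOmFkbWluMTIz
```

Decoding the token from the curl examples above yields exactly this triple, which is handy when you want to script the same calls from another client.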

4. Create a workspace

curl http://localhost:3000/api/workspaces \
-H "Content-Type: application/vnd.gooddata.api+json" \
-H "Accept: application/vnd.gooddata.api+json" \
-H "Authorization: Bearer YWRtaW46Ym9vdHN0cmFwOmFkbWluMTIz" \
-X POST \
-d '{
  "data": {
    "attributes": {
      "name": "Demo"
    },
    "id": "demo",
    "type": "workspace"
  }
}'

A workspace is a sandbox for a specific analytics scenario. It can also be used to deploy the same scenario many times to different tenants (e.g., your customers).

5. Create an analytical model

curl http://localhost:3000/api/data-sources/demo-ds/scan \
-H "Content-Type: application/vnd.gooddata.api+json" \
-H "Accept: application/vnd.gooddata.api+json" \
-H "Authorization: Bearer YWRtaW46Ym9vdHN0cmFwOmFkbWluMTIz" \
-X POST \
-d '{
  "mappingOnly": false,
  "mode": "append",
  "scanTables": true,
  "scanViews": false,
  "separator": "__",
  "tablePrefix": "",
  "viewPrefix": "",
  "primaryLabelPrefix": "",
  "secondaryLabelPrefix": "ls",
  "factPrefix": "f",
  "datePrefix": "",
  "grainPrefix": "gr",
  "referencePrefix": "r",
  "grainReferencePrefix": "",
  "denormPrefix": ""
}'

This API scans the connected Postgres database for tables and creates a Logical Data Model (LDM) on top of them. The model is returned as a JSON structure.
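You can inspect or post-process the returned model before publishing it. Here is a minimal sketch; the `ldm`/`datasets` layout and the dataset ids below are illustrative assumptions about the response shape, trimmed to the fields relevant here:

```python
# Sketch: inspect a scan response before publishing it to a workspace.
# The structure below is an illustrative assumption, not the full schema.
scan_response = {
    "ldm": {
        "datasets": [
            {"id": "orders", "title": "Orders"},
            {"id": "customers", "title": "Customers"},
        ],
        "dateInstances": [{"id": "order_date", "title": "Order date"}],
    }
}

def dataset_ids(response):
    """Return the ids of all datasets the scan discovered."""
    return [d["id"] for d in response["ldm"]["datasets"]]

print(dataset_ids(scan_response))  # ['orders', 'customers']
```

A check like this makes it easy to confirm the scan picked up the tables you expected before you PUT the model to the workspace.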

I published the model to my workspace via HTTP PUT. GoodData ships a Swagger UI console (http://localhost:3000/apidocs) with the stack for API documentation, and I was able to execute the model-publishing PUT request from there.

Swagger UI console is part of the stack. Sending HTTP PUT to publish the data model to workspace

Here is the model in the GoodData visual model editor.

I was able to create dozens of workspaces with this setup using a few lines of Python code (let me know if you're interested and I can share).
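A provisioning loop along these lines could look like the sketch below. The tenant ids and names are illustrative; the endpoint, payload shape, and token are the ones used in the curl example above, and each built request would be sent with `urllib.request.urlopen`:

```python
import json
import urllib.request

API = "http://localhost:3000/api/workspaces"
TOKEN = "YWRtaW46Ym9vdHN0cmFwOmFkbWluMTIz"

def workspace_request(ws_id, name):
    """Build the POST request that creates one workspace."""
    body = {"data": {"attributes": {"name": name},
                     "id": ws_id, "type": "workspace"}}
    return urllib.request.Request(
        API,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/vnd.gooddata.api+json",
            "Accept": "application/vnd.gooddata.api+json",
            "Authorization": f"Bearer {TOKEN}",
        },
        method="POST",
    )

# Hypothetical tenants; sending is left out so the sketch stays offline.
tenants = {f"tenant-{i}": f"Tenant {i}" for i in range(1, 4)}
requests = [workspace_request(ws_id, name) for ws_id, name in tenants.items()]
print(len(requests))  # 3
```

Separating request construction from sending also makes it trivial to dry-run the provisioning before pointing it at a real instance.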

6. Analytics!

Go to http://localhost:3000/analyze/#/demo/ and start creating beautiful data visualizations.

GoodData Analytical Designer

The data visualizations can be embedded as components (React, Vue, Angular, etc.) into a web or mobile application. I played with the GoodData.UI framework and was able to generate a web-application boilerplate in a few seconds.

Summary

With lightweight Python scripting, it took me less than a minute to deploy and configure self-service analytics running in a container co-located with my database: two shell commands and four curl API calls.

GoodData Community Edition

The free GoodData.CN Community Edition is available from Docker Hub. You can read more about it on the GoodData website.
