Containerizing kdb+ with Docker

Aidan O'Gorman
5 min read · Nov 14, 2021


Introduction

In this post, I create a container image for a simple kdb+ process and deploy it using Docker, then deploy replicated instances using Docker swarm. It’s the first in a series of posts about how open-source tools can be used to bring kdb+ in line with modern standards of software design and architecture.

kdb+ is a column-oriented relational time-series database with in-memory capabilities, used widely in the fintech and banking sectors for its real-time data processing. It is programmed and queried in ‘q’, a concise, expressive language which includes a SQL-like query dialect called ‘qSQL’. q is interpreted, dynamically typed, table (and column) oriented, and expressions are evaluated in right-to-left order.
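For a flavour of the syntax, here is a short illustrative q session (the table and its values are invented for this example):

q)2*3+4                             / right-to-left: 3+4 is evaluated first, then 2*7
14
q)t:([]sym:`AAPL`MSFT`AAPL;price:101.5 45.2 102.0)
q)select avg price by sym from t    / qSQL: average price grouped by sym
sym | price
----| ------
AAPL| 101.75
MSFT| 45.2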

The initial learning curve is steep, but once you get your head around the terse syntax, concise error messages and overloaded glyphs, you can do some powerful things very quickly. One of the downsides is the decision by the creators of the language to make it proprietary. Commercial licenses are prohibitively expensive for many and this has probably contributed to a lack of adoption in industries outside fintech and investment banking. As a result, there are very few frameworks or universal kdb+ tick solutions, and many kdb+ stacks are built bespoke and maintained by dedicated in-house teams.

I developed an interest in how open-source tools could be leveraged alongside a bespoke application to achieve some of the features that often ship as standard with enterprise-grade software solutions written in other languages:

  • Fault tolerance/resiliency
  • Disaster recovery
  • Scalability in response to variable workloads
  • Zero-downtime upgrades
  • Infrastructure and operations as code

For the last few months, I’ve been focusing on a couple of technologies which can help solve some of the above issues:

  1. Docker for containerization of application code
  2. Kubernetes for container orchestration, automated deployment, and application scaling

It should be noted at this point that Docker and Kubernetes are two distinct but complementary technologies, and there are plenty of articles devoted to the differences between the two. In this post, I demonstrate how I used them to deploy a highly available (HA), scalable instance of a basic kdb+ process.

Containerizing kdb+

A container is a standard unit of software that packages all the code and dependencies required to run reliably in any computing environment. Containers are built from container images: lightweight, standalone, executable packages which are OS agnostic and can therefore be deployed anywhere and will behave uniformly.

A vanilla kdb+ process requires three files:

  • q.k: contains functions which are loaded as part of the ‘bootstrap’ of kdb+
  • kc.lic: license file for kdb+ (I’m using the non-commercial 64-bit license)
  • q: the q binary file
aidanog: ~/kdb_container/kdb $ ls -lrth
total 832K
-rw-r--r-- 1 aidanog aidanog 24K Nov 13 17:34 q.k
-rwxr-xr-x 1 aidanog aidanog 797K Nov 13 17:34 q
-rw-r--r-- 1 aidanog aidanog 363 Nov 13 17:34 kc.lic
-rw-r--r-- 1 aidanog aidanog 35 Nov 13 18:01 example.q

I then added a fourth file, example.q, which contains a simple ‘hello’ function.
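The exact definition isn’t important; a one-liner along the lines of this sketch (written to be consistent with the output shown later in the post) would do:

/ example.q: a trivial function that greets the caller by name
hello:{"Hello \"",x,"\"!"}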

Now that we have everything required to run q locally, we need a way to recreate these conditions procedurally; this is where the ‘Dockerfile’ comes in. A ‘Dockerfile’ contains a list of instructions which are executed in sequence to build our container image:

  1. Define a base image on which to build
  2. Set some environment variables required to run ‘q’
  3. Install some useful packages like ‘rlwrap’, which adds line editing and history to the ‘q’ REPL
  4. Copy our kdb+ files in and set the working directory
  5. Define the command which starts a q process and loads ‘example.q’ when the container runs
FROM debian:9 AS base
MAINTAINER Aidan O'Gorman

# Do not clean here, it's cleaned later!
RUN apt-get update \
 && apt-get -yy --option=Dpkg::options::=--force-unsafe-io upgrade

# Set env variables for q
ENV QHOME /kdb
ENV PATH ${PATH}:${QHOME}
# This should point to the license file location
ENV QLIC /kdb

# Install some useful packages (ca-certificates, curl, rlwrap, runit, unzip),
# then clean up the apt cache to keep the image small
RUN apt-get -yy --option=Dpkg::options::=--force-unsafe-io --no-install-recommends install \
    ca-certificates \
    curl \
    rlwrap \
    runit \
    unzip \
 && apt-get clean \
 && find /var/lib/apt/lists -type f -delete

# Copy our kdb+ files in and set the working directory
COPY kdb /kdb
WORKDIR /kdb

# Start a q process which loads example.q and listens on port 1234
CMD ["q", "example.q", "-p", "1234"]

Deploying kdb+ from a Container Image

Now that’s done, we simply build our container image and create our container in Docker:

aidanog: ~/kdb_container $ docker build -t kdb-in-container/1.0 .
...
[+] Building 35.2s (11/11) FINISHED
aidanog: ~/kdb_container $ docker image ls
REPOSITORY             TAG      IMAGE ID       CREATED              SIZE
kdb-in-container/1.0   latest   536358b12c76   About a minute ago   183MB
aidanog: ~/kdb_container $ docker create kdb-in-container/1.0
e7f268521248e8078ecd8c73c74aa9dffc953b3df3726e1fe227e1210bdbb5d8
aidanog: ~/kdb_container $ docker start e7f268521248e8078ecd8c73c74aa9dffc953b3df3726e1fe227e1210bdbb5d8
e7f268521248e8078ecd8c73c74aa9dffc953b3df3726e1fe227e1210bdbb5d8
aidanog: ~/kdb_container $ docker ps -a
CONTAINER ID   IMAGE                  COMMAND                 CREATED         STATUS        PORTS   NAMES
e7f268521248   kdb-in-container/1.0   "q example.q -p 1234"   6 seconds ago   Up 1 second           magical_cerf
aidanog: ~/kdb_container $ docker exec -it e7f268521248 sh
# q
KDB+ 4.0 2021.04.26 Copyright (C) 1993-2021 Kx Systems
l64/ 4(16)core 6251MB root e7f268521248 172.17.0.2 EXPIRE 2022.08.06 aidan.ogorman@outlook.com KOD #4177517
q)
q)conn:hopen`::1234
q)conn(`hello;"Aidan")
"Hello \"Aidan\"!"

Horizontal Scaling in Docker

Now that we have an image which contains everything required to run our q process, we can easily create replicas of our process using a service and Docker swarm (a container orchestration tool). I use Docker swarm here to keep things within Docker, but I prefer Kubernetes as a container orchestrator and will introduce it in future posts.

aidanog: ~/kdb_container $ docker swarm init
...
aidanog: ~/kdb_container $ docker service create kdb-in-container/1.0
loj5lp019g8haom00foxydpr2
overall progress: 1 out of 1 tasks
1/1: running [==================================================>]
verify: Service converged
aidanog: ~/kdb_container $ docker service ls
ID             NAME            MODE         REPLICAS   IMAGE                         PORTS
loj5lp019g8h   adoring_yalow   replicated   1/1        kdb-in-container/1.0:latest
aidanog: ~/kdb_container $ docker scale service adoring_yalow=5
docker: 'scale' is not a docker command.
See 'docker --help'
aidanog: ~/kdb_container $ docker service scale adoring_yalow=5
adoring_yalow scaled to 5
overall progress: 5 out of 5 tasks
1/5: running [==================================================>]
2/5: running [==================================================>]
3/5: running [==================================================>]
4/5: running [==================================================>]
5/5: running [==================================================>]
verify: Service converged
aidanog: ~/kdb_container $ docker service ls
ID             NAME            MODE         REPLICAS   IMAGE                         PORTS
loj5lp019g8h   adoring_yalow   replicated   5/5        kdb-in-container/1.0:latest

Et voilà, we now have 5 replicas of our process running on our node. If we were running a typical kdb+ tick stack with historical data services, we could easily scale the number of available services in response to varying query load throughout the day.

This is a trivial example to show how Docker can help solve issues around scalability and availability of services. There are a couple of things to note:

  • These processes are not attached to a network and expose no ports, so they are not reachable by ingress traffic (a sketch of how a port could be published follows this list)
  • All processes are deployed on a single Docker swarm node; if that node goes down, all of the processes go down with it
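For reference, exposing the replicas would mean attaching the service to a network and publishing a port when creating it; a minimal sketch (the service name ‘kdb-svc’ is illustrative) might look like this:

docker service create --name kdb-svc --replicas 5 \
  --publish published=1234,target=1234 kdb-in-container/1.0

Each replica would then be reachable through the swarm’s ingress routing mesh on port 1234, which also load-balances incoming connections across the replicas.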

In the next article, I will show how we can leverage Kubernetes to automate the deployment and orchestration of containers.
