Kubernetes Deployment Dependencies
An itch that’s been wanting a scratch!
I’m working on a Helm Chart for a Blockchain platform. Helm helps with templating deployments, though I find myself duplicating lots of boilerplate. A bigger issue has been attempting to reflect dependencies between resources during deployment.
I’ve written a small proof-of-concept that I think helps and am interested in feedback. I realized that I want not only a (distributed) lock service (etcd? Chubby?), but also more transparent Kubernetes resources for orchestrating these dependencies.
The Problem
I have 2 Pods: first and second. I want second to block until first is “available”. This is commonly required for databases and their clients but the problem is — I think — a general one.
The solution that I’ve encountered is to share a volume between Pods and to use the file-system as the locking database. Something of the form:
apiVersion: v1
kind: Pod
metadata:
  name: pod-a
spec:
  volumes:
  - name: shared-pvc
    persistentVolumeClaim:
      claimName: shared-pvc
  initContainers:
  - name: await
    image: busybox
    command:
    - sh
    - -c
    - |
      while [ ! -f /shared/created ]; do
        echo Waiting for something to be created
        sleep 5
      done
      echo created
    volumeMounts:
    - mountPath: /shared
      name: shared-pvc
This has several advantages:
- It leverages “Everything is a File” in Linux; available everywhere
- It’s easy to understand
But it has several disadvantages:
- It’s opaque to Kubernetes and to everything outside the participating Pods and containers
- It requires a shell to wrap even a single binary
- It requires read-write-many volumes
- It’s not pluggable
The following is a proof-of-concept for an alternative approach which I suspect is what etcd, Chubby, ZooKeeper and others would provide. If I can convince myself of the merits of this approach, I plan to try using etcd or possibly surfacing Google’s Runtime Config service to Kubernetes.
The Proof-of-Concept
I wrote a simple Golang httpd service that accepts GETs to retrieve “variables” and POSTs to create them. The service itself is created against a global configuration value.
NB As I write this, I realize that an immediate improvement is to make the global configuration value dynamic rather than static. This would then be provided with each GET and POST to partition into namespaces.
So, for example, a GET of the variable a/path/my/variable returns 200 if the variable has been created:
curl \
--request GET \
http://simple-config:9999/?variable=a/path/my/variable
[200]
The following command creates the above variable:
curl \
--request POST \
--header "Accept: application/json" \
--data '{"variable":"a/path/my/variable"}' \
http://simple-config:9999
What benefit does this yield?
Unsure.
Here’s an example implementation of first:
This revises the initContainer (lines 12–29). It uses busybox (which has wget, not curl). It grabs the container’s name from the Downward API (e.g. first), concatenates it to init, and creates a file in the service’s volume: first/init.
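The description above can be sketched as the following Pod manifest. This is a hypothetical reconstruction, not the original gist: the busybox image, the JSON POST body, and the use of Kubernetes’ $(NAME) variable expansion in args are my assumptions.

```yaml
# Hypothetical sketch of the "first" Pod (image tag and POST shape are assumptions)
apiVersion: v1
kind: Pod
metadata:
  name: first
spec:
  restartPolicy: Never
  initContainers:
  - name: init
    image: busybox
    env:
    # Downward API: surface the Pod's name so it can prefix the variable
    - name: NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    # Shell-less: busybox wget supports --post-data directly
    command:
    - wget
    - --quiet
    - --output-document=-
    - '--post-data={"variable":"$(NAME)/init"}'
    - http://simple-config:9999
  containers:
  - name: container
    image: busybox
    env:
    - name: NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    command:
    - wget
    - --quiet
    - --output-document=-
    - '--post-data={"variable":"$(NAME)/container"}'
    - http://simple-config:9999
```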
Checking the service’s volume, we find:
.
└── first
└── init
The main container (called container) blocks on completion of its initContainers. By the time container begins, we may be confident the first/init file has been created. The container then creates a second file called first/container:
.
└── first
    ├── container
    └── init
All good!
We can then deploy second:
This Pod (second) must block until first is ready. Our protocol is that first will create a variable first/container when it is ready. So, second has an initContainer that waits for this variable’s endpoint to become ready (200). This code appears in lines 16–27.
Unfortunately (!) this required a shell (a requirement I was hoping to avoid) *but* I propose replacing this with a simple Golang binary that polls a given endpoint. This would be similar to the Healthcheck alternative I wrote about previously (link).
As before, the main container (container) uses busybox’s wget to create another variable noting the successful completion of this container (second/dependent).
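A rough sketch of what that wait initContainer could look like; the alpine image and the curl polling loop are assumptions inferred from the install logs shown later in the post:

```yaml
# Hypothetical "wait" initContainer for second (image and loop are assumptions)
initContainers:
- name: wait
  image: alpine
  command:
  - sh
  - -c
  - |
    apk --no-cache add curl
    URL="http://simple-config:9999/?variable=first/container"
    # Poll the variable's endpoint until it returns 200
    until [ "$(curl --silent --output /dev/null --write-out '%{http_code}' ${URL})" = "200" ]; do
      echo "Checking: ${URL}"
      sleep 5
    done
    echo "Ready"
```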
After both first and second are complete, we have:
.
├── first
│   ├── container
│   └── init
└── second
    └── dependent
These trees show a side-effect benefit of this approach: although the implementation currently uses a file-system, that file-system is opaque to the clients that are creating and reading variables. An alternative implementation (etcd, Runtime Config) could be swapped in without rewriting the container manifests.
Here’s Google’s Console showing the Service deployed and both first and second completed:
And, drilling into each:
and:
and the logs for the wait container:
Alternatively, you may view the wait container’s logs from the command line:
kubectl logs pod/second \
--container=wait \
--namespace=${NAMESPACE} \
--context=${CONTEXT}
fetch http://dl-cdn.alpinelinux.org/alpine/v3.7/...
fetch http://dl-cdn.alpinelinux.org/alpine/v3.7/...
(1/4) Installing ca-certificates (20171114-r0)
(2/4) Installing libssh2 (1.8.0-r2)
(3/4) Installing libcurl (7.60.0-r1)
(4/4) Installing curl (7.60.0-r1)
Executing busybox-1.27.2-r7.trigger
Executing ca-certificates-20171114-r0.trigger
OK: 6 MiB in 15 packages
Checking: http://simple-config:9999/?variable=first/container
Ready
Here you can see that the initContainer blocks the start of second’s container until the variable becomes available which, in this case, happens promptly.
I’ve shown using tree to view the contents of the directory that’s backing the simple-config service. To access the debugging container (debug) that’s associated with the simple-config service (see deployment.yaml below, lines 39–44), we must first determine the pod’s name. Because the pod has a label (component: debug) associated with it, we can quickly select it. We then use JSONPath to grab the (assumed first and only) pod’s name:
kubectl get pods \
--selector=component=debug \
--output=jsonpath="{.items[0].metadata.name}" \
--namespace=${NAMESPACE} \
--context=${CONTEXT}
We can combine this command with a kubectl exec
to access the ash
shell in the debug container:
kubectl exec \
--stdin \
--tty \
$(\
kubectl get pods \
--selector=component=debug \
--output=jsonpath="{.items[0].metadata.name}" \
--namespace=${NAMESPACE} \
--context=${CONTEXT}) \
--namespace=${NAMESPACE} \
--context=${CONTEXT} \
-- ash
NB You may drop both sets of the --namespace and --context flags if you’re using the defaults.
NB The first time you exec into the container, you may wish to install tree (you’ll need to update the package index first: apk update && apk add tree).
You may then check the contents of the /config directory:
tree /config
/config
├── first
│   ├── container
│   └── init
└── second
    └── dependent
simple-config is configured in 3 places.
First, the process (main.go) can be configured by the environment. A variable SIMPLE_CONFIG_PATH is used to define an absolute path to the storage of variables. This defaults to the current directory (.). A variable SIMPLE_CONFIG_PORT is used to define the listening port. This defaults to 8080.
The simple-config Service is then configured to use the /config directory and port 9999 (arbitrarily) by defining env key:value pairs in deployment.yaml lines #32–35. These include a reference to the container’s /config directory. This is actually a mount of a volume called config (lines 33–34). The volume itself is defined as an emptyDir in lines 24–25.
For convenience and/or comparison, the equivalent Docker command to run the container locally would be:
HOST_PATH=...
HOST_PORT=...
CONT_PATH=...
CONT_PORT=...

docker run \
--interactive \
--tty \
--volume=$PWD/${HOST_PATH}:${CONT_PATH} \
--env=SIMPLE_CONFIG_PATH=${CONT_PATH} \
--env=SIMPLE_CONFIG_PORT=${CONT_PORT} \
--publish=${HOST_PORT}:${CONT_PORT} \
dazwilkin/simple-config:v1
Golang: Server
Here’s the code for the simple-config server implementation:
Here’s its Dockerfile:
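A representative multi-stage Dockerfile; the Go toolchain version, paths, and the scratch base image are assumptions and may differ from the original:

```dockerfile
# Build stage: compile a static binary (hypothetical paths)
FROM golang:1.10 AS build
WORKDIR /go/src/simple-config
COPY main.go .
RUN CGO_ENABLED=0 GOOS=linux go build -o /simple-config main.go

# Runtime stage: ship just the binary
FROM scratch
COPY --from=build /simple-config /simple-config
ENTRYPOINT ["/simple-config"]
```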
And here’s a Deployment manifest for it:
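A Deployment sketch consistent with the configuration described in the text (the /config emptyDir, port 9999, and a debug sidecar selectable via component=debug); the labels and line numbers will not match the original gist exactly:

```yaml
# Hypothetical reconstruction of deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: simple-config
spec:
  replicas: 1
  selector:
    matchLabels:
      app: simple-config
  template:
    metadata:
      labels:
        app: simple-config
        component: debug
    spec:
      volumes:
      - name: config
        emptyDir: {}
      containers:
      - name: simple-config
        image: dazwilkin/simple-config:v1
        env:
        - name: SIMPLE_CONFIG_PATH
          value: /config
        - name: SIMPLE_CONFIG_PORT
          value: "9999"
        ports:
        - containerPort: 9999
        volumeMounts:
        - name: config
          mountPath: /config
      # Debug sidecar: shares the config volume so we can inspect it with tree
      - name: debug
        image: alpine
        command: ["sh", "-c", "while true; do sleep 30; done"]
        volumeMounts:
        - name: config
          mountPath: /config
```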
Golang: Client
Here’s an implementation of a client for the simple-config service:
A Dockerfile with a tweaked entrypoint that catches the Golang binary too:
https://gist.github.com/DazWilkin/b9044de454691eef93dc69900c340b72
And a Pod manifest:
This time, we’re shell-less, not for any aversion to shells but to be more explicit about intent. The initContainer references the client (dazwilkin/simple-config/client) and invokes exists $(VARIABLE) to check whether the variable (plus/container) exists.
Subsequently, the container container invokes create $(VARIABLE) to mark its completion.
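The description above can be sketched as a Pod manifest like the following; the Pod’s name, the second container’s VARIABLE value, and the use of args with $(VARIABLE) expansion are my assumptions:

```yaml
# Hypothetical reconstruction of the client-based Pod manifest
apiVersion: v1
kind: Pod
metadata:
  name: second
spec:
  restartPolicy: Never
  initContainers:
  # Shell-less wait: the client binary polls until the variable exists
  - name: wait
    image: dazwilkin/simple-config/client
    env:
    - name: VARIABLE
      value: plus/container
    args:
    - exists
    - $(VARIABLE)
  containers:
  # On completion, the client binary creates this Pod's own variable
  - name: container
    image: dazwilkin/simple-config/client
    env:
    - name: VARIABLE
      value: second/dependent
    args:
    - create
    - $(VARIABLE)
```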
Conclusion
I’m undecided on the merits of this approach. It addresses some of my concerns, though not all of them, and it adds some complexity. I continue to believe that Kubernetes should have a mechanism for automating away this type of deployment complexity.
What alternatives exist?