
From Docker to Elastic Self-Healing Infrastructure

by Pablo Borja

Published in Jeff Tech · Sep 21, 2018

Software development has experienced a paradigm shift in recent years. In maritime trade, standard-sized shipping containers made trade more efficient. The same is now happening in software, thanks to containers.

Docker is a containerization tool, and it's becoming increasingly common. It simplifies many things when we want to deploy a complex infrastructure.

At Mr Jeff, we think that in the age of automation we should avoid infrastructures that need human intervention. Infrastructures are becoming more complex and elaborate, so there's also a higher chance of human error.

We should try to make infrastructures capable of healing themselves, meaning that if something goes wrong, the system fixes itself without human intervention. In a way, it's like giving AI to your infrastructure.

Wait, wait… Something is wrong and I should do nothing? Tell me more!

Yes, providing intelligence to your infrastructure is a powerful feature.

With that intelligence, we can also build elastic infrastructures: in a few words, infrastructures that are able to scale in or out depending on system load. At Mr Jeff, we'd like to avoid maintaining an expensive infrastructure when we don't really need it, but at the same time we want one that is strong and big enough for special events like advertising campaigns, marketing pushes, and so on.

Moreover, Mr Jeff is a multinational company with users in different time zones, so we don't really have a maintenance window. We should always be available, and users shouldn't notice any downtime while our team is performing updates.

We didn’t make an infrastructure like this from the beginning, so we had to modify it as we progressed.

From a monolith to a containerized micro-service architecture

In the beginning, here at Mr Jeff we had a monolithic infrastructure built over third-party middleware. It worked fine, but as the business grew, it became more and more limited in features, and it didn't allow us to scale properly. The former point was addressed by setting up a new micro-service based architecture. As a solution for the latter, we targeted a distributed infrastructure capable of easier and better scaling.

As a first step, we turned to Spring and Netflix OSS to develop our own distributed architecture. But we were still doing deployments manually, and the infrastructure didn't have enough intelligence to govern itself.

Now we have different needs: we have more traffic, and we must be able to absorb the load. Our infrastructure was becoming more complex, and we needed self-healing and elasticity. Kubernetes helped us meet these needs, which in turn improved our continuous delivery process and made us much more agile.

Why Kubernetes?

When we started looking for a new infrastructure, there were several options, like Mesos with Marathon or Kubernetes. We decided to bet on Kubernetes: it was a growing orchestration system with a lot of possibilities. Now more and more cloud providers support and even deploy Kubernetes as part of the services they offer.

A few years ago, deploying a Kubernetes cluster could be a challenging task, but now we have many tools, like Kops, Kubespray, and Rancher, that simplify the process. Some cloud providers even offer managed Kubernetes!

Migrating a dockerized app to Kubernetes is the logical next step. First, we have to convert our Docker startup scripts into Kubernetes workload configuration files.

There are different workloads in Kubernetes, like DaemonSets, Jobs, or Deployments, so we should understand which is the best option. If we don't have any experience with these configuration files, there are tools like Kompose for converting docker-compose files into Kubernetes configuration files. That's a good start.
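As a minimal sketch of that starting point (the service name and image here are hypothetical), Kompose can take a docker-compose file like this one and generate the equivalent Kubernetes manifests:

```yaml
# docker-compose.yml, a minimal hypothetical service definition.
# Running `kompose convert -f docker-compose.yml` generates a
# Kubernetes Deployment and Service from it.
version: "3"
services:
  web:
    image: mrjeff/web:1.0   # hypothetical image name
    ports:
      - "8080:8080"
```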

Then, we should define Service and Ingress configurations, which expose our app to the outside world.
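For illustration, a Service and an Ingress for the hypothetical web app above might look like this (the hostname and ports are assumptions):

```yaml
# Service: a stable, cluster-internal entry point for the web pods.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web            # matches the pods' labels
  ports:
    - port: 80          # port exposed inside the cluster
      targetPort: 8080  # port the container listens on
---
# Ingress: routes external HTTP traffic to the Service.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web
spec:
  rules:
    - host: web.example.com   # hypothetical hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web
                port:
                  number: 80
```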

The infrastructure swap was easy. With a Kubernetes cluster working properly, we just switched the DNS, and boom! Our whole infrastructure was now elastic.
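As a hedged sketch of where that elasticity can come from, a HorizontalPodAutoscaler grows and shrinks a Deployment with system load (the Deployment name and thresholds here are assumptions, not our exact setup):

```yaml
# hpa.yaml: scale the hypothetical "web" Deployment with CPU load.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 3        # baseline for quiet periods
  maxReplicas: 10       # headroom for campaigns and special events
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add pods above 70% average CPU
```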

How Kubernetes works

With Kubernetes we can be Self-Healing

In the Deployment file we define a ReplicaSet, which ensures that a specified number of Pod replicas is running at any given time. Pods are the smallest deployable units of computing in Kubernetes; in essence, a Pod is a Docker container or a set of them.
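A minimal Deployment along these lines might look as follows (the image and labels are hypothetical):

```yaml
# deployment.yaml: the ReplicaSet created from this spec keeps
# exactly 3 "web" pods running, restarting or rescheduling them
# whenever one crashes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: mrjeff/web:1.0   # hypothetical image
          ports:
            - containerPort: 8080
```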

In this case we have 3 replicas. If one replica crashes, the ReplicaSet will start another one. In the same way, if a node running two replicas crashes, the ReplicaSet will schedule 2 replicas on another healthy node. However, we must be careful about how we distribute our replicas. Having all replicas deployed on the same node is usually not a good idea, because if that node goes down, we get downtime for that service. In addition, each Deployment has its own CPU load pattern, so for a good CPU load distribution we should mix them across nodes.
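Kubernetes can enforce this spreading for us. As one sketch (reusing the hypothetical Deployment above), a soft pod anti-affinity rule asks the scheduler to avoid placing two web replicas on the same node:

```yaml
# Snippet for the Deployment's pod template (spec.template.spec).
# "preferred" makes this a soft rule: the scheduler tries to spread
# replicas across nodes but can still co-locate them if it must.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: web
          topologyKey: kubernetes.io/hostname
```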

In a Spring architecture, service discovery is needed, and Eureka does that job. In Kubernetes, the cluster does it for us. Imagine two Deployments, each distributed across different nodes. Kubernetes has an object called Service that takes over Eureka's job: a Kubernetes Service is an abstraction that defines a logical set of Pods, and we access Pods through Services.

The Service abstraction is very useful: we can connect our Deployments using the service name. In this setup, if any node goes down, the app will still work without any issue. When the Service is called, Kubernetes will not send traffic to unhealthy Pods, and the ReplicaSet will schedule new Pods to comply with the Deployment specification.
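How does Kubernetes know which Pods are healthy? Through probes. As an assumed example (the /health endpoint is hypothetical), a readiness probe keeps a Pod out of the Service's rotation until it responds, and a liveness probe restarts a container that stops responding:

```yaml
# Snippet for the container spec in the pod template.
readinessProbe:            # gate traffic until the app is ready
  httpGet:
    path: /health          # hypothetical health endpoint
    port: 8080
  initialDelaySeconds: 5   # wait before the first check
  periodSeconds: 10        # re-check every 10 seconds
livenessProbe:             # restart the container if it hangs
  httpGet:
    path: /health
    port: 8080
  periodSeconds: 15
```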

Kubernetes makes us Highly Available

Another important advantage of Kubernetes clusters is Deployment updates. We don't want downtime when a new version of a Mr Jeff app or site is deployed.

Thanks to Kubernetes, we can update our Docker containers without any downtime. Imagine we have a Deployment with 2 replicas and we update the container image. Kubernetes will create a new container with the new image; when that container is ready, Kubernetes will route traffic to it, remove the old one, and then repeat the same process with the other container.
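The pace of that replacement is controlled by the Deployment's update strategy. A sketch, again using the hypothetical web Deployment:

```yaml
# Snippet for the Deployment spec. Triggering an update, e.g. with
# `kubectl set image deployment/web web=mrjeff/web:1.1`, then rolls
# pods over one at a time without dropping capacity.
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1         # at most one extra pod during the update
    maxUnavailable: 0   # never go below the desired replica count
```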

This process is called a Rolling Update. We could also set up Blue/Green deployments using Services and Labels, as sketched below.
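The Blue/Green idea, hedged and with hypothetical names throughout: run two Deployments labeled with different versions, and flip the Service's selector to cut all traffic over at once:

```yaml
# Two parallel Deployments exist, labeled version: blue and
# version: green. This Service points at blue; changing the
# selector to "green" and re-applying switches traffic atomically.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
    version: blue   # flip to "green" to cut over
  ports:
    - port: 80
      targetPort: 8080
```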

Conclusions

Docker and Kubernetes are awesome tools. Since we had already made the effort of building a microservice architecture based on Docker containers, it made sense to continue that effort by improving our infrastructure.

Configured the right way, Kubernetes can make an infrastructure elastic, self-healing, and highly available. This intelligence allows us to build more robust distributed infrastructures.
