Making Kubernetes Approachable — Our Experience with Kops and Rancher

A Real World Example of Enabling HA and Security for Kubernetes Clusters

Published in Hashmap, an NTT DATA Company

6 min read · Jun 26, 2018

by Chris Herrera and Randy Pitcher

Here at Hashmap, we work to continually improve our Tempus IIoT framework, and we test a number of approaches to specific problems using a wide variety of tools. We actively balance our technical goals for Tempus (high value, scalable, flexible deployment models for cloud and on premise, minimum resource requirements, etc.) against the need to keep the solution practical, simple to use, and easy to manage and deploy.

With that being said, have you ever done a manual install of Kubernetes?

We’ve done it, and it’s not a task to be taken lightly. It is an absolute beast, especially if you want high availability (HA) with the cluster spread across availability zones.

Why Kubernetes for the Tempus Software Engineering Team at Hashmap?

For Tempus IIoT, we have an ingestion framework that requires high availability. In many cases, the IoT devices have limited storage capacity, so Tempus must always be available and able to accept data. The requirements vary with a field gateway concept, but when the goal is to perform surveillance on an IoT data stream in an operational technology sense, reliability is a must.

When that’s the case, a declarative system like Kubernetes is incredibly valuable for us. The ability to easily tell Kubernetes “we need 3 pods always running and do whatever you have to do to make it happen and tell me if you can’t” is critical. This type of behavior is perfect for an IoT ingestion framework.
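The declarative idea can be illustrated with a minimal sketch of a reconciliation step. This is not Kubernetes code; the `reconcile` function and `ReconcileResult` type are hypothetical names used only to show how a controller might compare a declared replica count with what is actually running and decide what to do.

```python
from dataclasses import dataclass


@dataclass
class ReconcileResult:
    to_start: int    # pods the controller should create
    to_stop: int     # surplus pods the controller should remove
    satisfied: bool  # True when observed state already matches desired state


def reconcile(desired_replicas: int, running_pods: int) -> ReconcileResult:
    """Compare desired and observed pod counts and return the corrective action."""
    diff = desired_replicas - running_pods
    return ReconcileResult(
        to_start=max(diff, 0),
        to_stop=max(-diff, 0),
        satisfied=(diff == 0),
    )


if __name__ == "__main__":
    # One pod has crashed: the declared state (3) no longer matches reality (2).
    print(reconcile(desired_replicas=3, running_pods=2))
```

The caller only states the goal (3 pods); working out the steps to get there is left to the control loop, which is the property that makes this model attractive for an always-on ingestion service.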

So what are some tools that we have been using to make Kubernetes more approachable?

Kops

Kops is a tool for creating Kubernetes clusters in a production environment. It is an official part of the Kubernetes project, lives under the Kubernetes repo, and is managed by the Kubernetes team. It works great on AWS and GCE cloud instances. On AWS, Kops requires an S3 bucket as its state store, and you will want to make sure that bucket is secured.

For Role Based Access Control (RBAC), note that Kops, unlike other solutions such as AKS on Azure and EKS on AWS, does not create an RBAC-enabled cluster by default. The cluster will be insecure unless a few additional steps are taken; they aren’t too difficult, but they are not the default. Kops is driven from the command line, where you provide your AWS access key and secret key.

With Kops, it’s straightforward to specify your needs: for example, 3 masters, the node size (t2.medium on AWS for the budget conscious), and the availability zones. On AWS, you would specify the us-east-2 region with a, b, and c as the availability zones (us-east-2a, us-east-2b, and us-east-2c). From there, power, infrastructure, etc. are all replicated across the 3 AWS zones, so at least 2 masters will be up and running at any given time, even if one zone fails.
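As a rough sketch of what such a specification looks like, the Python helper below assembles an argument list for `kops create cluster` without executing anything. The cluster name and S3 bucket are hypothetical placeholders, and the helper functions are illustrative only. The flags shown are the ones we understand Kops to accept, but newer releases may rename the master-related flags, so check `kops create cluster --help` for your version. A second helper shows the majority arithmetic behind running 3 masters.

```python
import shlex


def kops_create_cluster_args(name, state_store, region, zone_suffixes,
                             node_size="t2.medium", node_count=3):
    """Build an argument list for `kops create cluster` (nothing is executed)."""
    zones = ",".join(f"{region}{suffix}" for suffix in zone_suffixes)
    return [
        "kops", "create", "cluster",
        f"--name={name}",
        f"--state={state_store}",
        f"--zones={zones}",
        f"--master-zones={zones}",  # one master per listed zone
        f"--node-size={node_size}",
        f"--node-count={node_count}",
        "--authorization=RBAC",     # request an RBAC-enabled cluster
    ]


def tolerated_master_failures(masters):
    """Masters that can fail while a majority (quorum) stays available."""
    quorum = masters // 2 + 1
    return masters - quorum


if __name__ == "__main__":
    args = kops_create_cluster_args(
        name="tempus.example.com",              # hypothetical cluster name
        state_store="s3://example-kops-state",  # hypothetical state bucket
        region="us-east-2",
        zone_suffixes=["a", "b", "c"],
    )
    print(shlex.join(args))
    print(tolerated_master_failures(3))
```

Building the command as a list keeps each choice (zones, sizes, authorization mode) visible and reviewable before anything is run against a real AWS account.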

Modifying clusters in Kops is declarative, just like Kubernetes: if you need to scale up or down, it’s simply an update to the Kops configuration.
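To make "an update to the configuration" concrete, here is a small sketch that treats an instance group as plain data and changes its size bounds. The dictionary mirrors the general shape of a Kops InstanceGroup spec (`machineType`, `minSize`, `maxSize`) as we understand it; the `scale_instance_group` helper and the sample values are hypothetical.

```python
import copy


def scale_instance_group(spec, min_size, max_size):
    """Return a copy of an instance-group spec with new size bounds."""
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")
    updated = copy.deepcopy(spec)  # leave the original spec untouched
    updated["spec"]["minSize"] = min_size
    updated["spec"]["maxSize"] = max_size
    return updated


if __name__ == "__main__":
    nodes = {
        "kind": "InstanceGroup",
        "metadata": {"name": "nodes"},
        "spec": {"machineType": "t2.medium", "minSize": 3, "maxSize": 3},
    }
    scaled = scale_instance_group(nodes, min_size=5, max_size=5)
    print(scaled["spec"])
```

In practice the edited spec is handed back to Kops (for example with `kops edit ig nodes` followed by `kops update cluster --yes`), and Kops works out the changes needed to reach the new state.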

Rancher

Kops is really nice if you have a good understanding and working knowledge of Kubernetes, and you know the cloud provider and their services such as how S3 works. If you do, then go for it.

The problem is that this level of understanding and these skill sets are often lacking, and that is where Rancher can help.

The Rancher solution is described as “Enterprise Management for Kubernetes. Every distro. Every cluster. Every cloud.” We’ve found it to be a good option in terms of making Kubernetes easier to deal with because it provides a nice web UI to create and manage Kubernetes clusters on really just about anything. Rancher has drivers for bare metal, all the cloud providers, DigitalOcean, OpenStack, you name it — they pretty much have it.

Rancher 1.0 vs 2.0

Rancher has taken a different approach with its 2.0 version, which was just released in May. For background, when Rancher 1.0 was first introduced, it was installed on a separate machine (separate from the Kubernetes cluster it was managing).

Rancher 2.0 was built on Kubernetes and uses an embedded Kubernetes cluster, so effectively, Rancher on Kubernetes is being used to manage Kubernetes. It’s a bit unclear whether this is the direction that Rancher ultimately wants to take with the solution.

It’s reasonable to assume that abstracting the complexity of Kubernetes is something that some people may want, but realistically, from an SRE perspective, a deep understanding of the infrastructure that the application is running on is fundamental.

For HA in Rancher 1.0, Rancher instances were managed separately. The same Kubernetes cluster could of course be used for HA, but it was not advised. In fact, we broke it over, and over, and over again.

Interestingly, though, 2.0 made everything more cryptic, even as Rancher looked to abstract away more of the complexity of Kubernetes.

In the Real World

The problem is that when running a system of any size, you need to keep the “Kubernetes” in Kubernetes rather than abstract it away.

Time and again, our team needed to get metrics that were not available via the Rancher UI. The UI certainly looks good, but when errors are cryptic, it can create issues.

When setting up our latest Kubernetes cluster, we started with Rancher 2.0 and immediately flipped over to Kops due to the cryptic nature of the error messages in the Rancher UI. Granted, at the time, Rancher 2.0 was still in beta and limited documentation was available.

Speaking of documentation, and comparing both solutions, Rancher is a wonderful open source project with decent documentation. But the Kubernetes documentation is the cream of the crop, and with the activity levels on the Kubernetes repo, it’s hard to compete with that. It’s just really good documentation, and we don’t tend to say that too often about open source projects.

Some Considerations

A couple of concerns that come to mind regarding going down the Rancher path…

There will be a reliance on Rancher to integrate the upstream versions of Kubernetes into RKE, although Rancher has committed to doing this.
Ultimately, what is the Rancher team looking to achieve, and what is their design rationale: to make Kubernetes easier to use, to improve the user experience, etc.?

Final Thoughts — Go Kops If You Are Going Big

There is a law of conservation of complexity: complexity will exist somewhere. “Internalizing” complexity and exposing just a subset of Kubernetes functionality (the Rancher approach) will need to hit the right target audience.

If you are running a large-scale SaaS offering, Kops is going to be the way to go for now.

At Hashmap, we have a highly secure Kubernetes cluster; in fact, even calls to the API servers are authenticated. We found this extremely difficult to achieve in Rancher, but with Kops it took just 2 commands.

In a future post, we will provide a step-by-step guide on how to achieve this type of security.

Need Help with Kubernetes and DevOps?

If you’d like additional assistance in this area, we offer a range of enablement workshops and consulting service packages, and we would be glad to work through your specifics.

Feel free to share on other channels and be sure to keep up with all new content from Hashmap at https://medium.com/hashmapinc.

Chris Herrera is a Senior Enterprise Architect at Hashmap working across industries with a group of innovative technologists and domain experts accelerating high value business outcomes for our customers. You can follow Chris on Twitter, connect with him on LinkedIn, or check out the Tempus IIoT project.

Randy Pitcher is a Big Data and IoT Developer at Hashmap working across industries with a group of innovative technologists and domain experts accelerating high value business outcomes for our customers. You can connect with him on LinkedIn or collaborate with him on the Tempus IIoT project.

Also, be sure to catch their Weekly IoT on Tap Podcast for a casual conversation about IoT from a developer’s perspective.