Kubernetes Introduction for VMware Users

Hany Michael
15 min read · Nov 12, 2017


This is the second part of my “Kubernetes in the Enterprise” blog series. As I mentioned in my last article, it is important to get everyone to the same level of understanding about Kubernetes before we can proceed to the design and implementation guides.

I am not going to take the traditional approach here to explaining the Kubernetes architecture and technologies. Instead, I will explain everything through comparisons with the vSphere platform that you, as a VMware user, are already very familiar with. This is the approach I would have liked someone to use when introducing K8s to me, because Kubernetes on its own can be very confusing and overwhelming at the beginning. I will also add that I used this approach internally at VMware to introduce Kubernetes to audiences from different practices, and it has proven to work great and get people up to speed quickly on the core concepts.

An important note, though, before we kick this off: I am not using this comparison for its own sake, or to prove any particular similarities or differences between vSphere and Kubernetes. Both are distributed systems at heart, so they naturally share traits with any other system of that kind. What I am trying to achieve at the end of the day is to introduce an incredible technology like Kubernetes to the very wide VMware community.

Figure: The Kubernetes overall architecture compared to vSphere

A little bit of history

You should already be familiar with containers before reading this post. I am not going to go through the basics, as there are plenty of resources out there that cover them. What I see very often when I speak with my customers, though, is that they cannot make much sense of why containers have taken our industry by storm and become so popular in record time. To answer this question, and to set the context for what is coming, I have to tell you a little bit about my own history as a practical example of how I personally made sense of the shift that is happening in our industry.

I used to be a web developer back in 2003, before I got introduced to the telecom world; it was my second paying job after being a network engineer/admin. (I know, I was a jack of all trades back then.) I used to code in PHP, and I built all sorts of applications, from small internal apps used by my employer, to professional voting apps for TV programs, to telco apps interfacing with VSAT hubs and interacting with satellite systems. Life was great except for one major hurdle that I am sure every developer can relate to: the dependencies.

I would first code my app on my laptop using something like the LAMP stack, and once it worked well, I would upload the source code to the servers, whether hosted out on the internet (anyone remember RackShack?) or on private servers for our end customers. As you can imagine, as soon as I did that, my app would break and just would not work on those servers. The reason, of course, was that the dependencies I used (Apache, PHP, MySQL, etc.) had different releases than the ones on my laptop. So I had to figure out a way to either upgrade those releases on the remote servers (bad idea) or re-code what I did on my laptop to match the remote stacks (an even worse idea). It was a nightmare, and sometimes I hated myself and questioned why I was doing this for a living.

Fast forward 10 years, and along came a company called Docker. I was a VMware consultant in professional services (2013) when I first heard about Docker, and let me tell you, I could not make any sense of that technology back then. I kept saying things like: why would I run containers when I can do that with VMs? Why would I give up important features like vSphere HA, DRS or vMotion for the supposed benefits of booting up a container instantly or skipping the “hypervisor” layer? Everybody was running VMs, they worked great, and so on and so forth. In short, I was looking at this from a pure infrastructure perspective.

But then I started looking closer until it just hit me: everything about Docker relates to developers. Only when I started thinking like one did I immediately get it. What if I had had this technology back in 2003 to package my dependencies? My web apps would have worked no matter what server they ran on. Better yet, I would not have had to keep uploading source code or setting up anything special. I could just “package” my app in an image and tell my customer to download that image and run it. That is every web developer’s dream!

So this is all great. Docker solved a huge issue of interoperability and packaging, but now what? As an enterprise customer, how can I operate this app at scale? I still want my HA, my DRS, my vMotion and my DR. You solved my developer problems and created a whole bunch of new ones for my operations (a.k.a. DevOps) teams. Those teams need a platform to run containers the same way they used to run VMs. And we are back to square one.

But then along came Google to tell the world that it had actually been running containers for years (and in fact pioneered much of the underlying technology; look up cgroups), and that the proper way to do it is through a platform they called Kubernetes. They then open sourced it, gave it as a gift to the community, and that changed everything again.

Understanding Kubernetes by comparing it to vSphere

So what is Kubernetes? Simply put: it is to containers what vSphere is to VMs, the platform that makes them datacenter-ready. If you ran VMware Workstation back in the early 2000s, you know that VMs were not seriously considered for running inside datacenters. Only when VI/vSphere arrived with vCenter and ESXi hosts did VMs make their huge impact on the world. It is almost the same story today: Kubernetes gives us a way to run and operate containers in a production-ready manner. This is why we will compare vSphere side by side with Kubernetes to explain the details of this distributed system and get you up to speed with its features and technologies.

Figure: The VM evolution from Workstation to vSphere compared to the current evolution of containers to Kubernetes

System Overview

Just like vSphere has vCenter and ESXi hosts, Kubernetes has the concept of a master and nodes. In this context, the K8s master is the equivalent of vCenter in that it is the management plane of the distributed system. It is also the API entry point through which you manage your workloads. Similarly, the K8s nodes act as the compute resources, like ESXi hosts; this is where you run your actual workloads (in K8s’ case we call them pods). The nodes can be virtual machines or physical servers, whereas in vSphere’s case the ESXi hosts always have to be physical.

You can also see that K8s has a key-value store called “etcd”. It is similar to the vCenter Server database in that it stores the cluster configuration as the desired state you want to adhere to.

On the differences side, the K8s master can also run workloads, while vCenter cannot; the latter is a virtual appliance dedicated to management. The K8s master is still considered a compute resource, but it is not a good idea to run enterprise apps on it. Only system-related apps should go there.

So what does this look like in the real world? You will mainly use the CLI to interact with this system (but the GUI is still a very viable option). In the screenshot below, you can see that I am using a Windows machine to connect to my Kubernetes cluster via the command line (I am using cmder, in case you are wondering). You can see in the screenshot that I have one master and four nodes, running K8s v1.6.5, and that the nodes’ operating system is Ubuntu 16.04. At the time of writing, we are mainly living in a Linux world here, where your master and nodes are always based on Linux distributions.

Screenshot: Managing Kubernetes using the CLI and GUI
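
If you want to check the same details on your own cluster, a couple of kubectl commands give you roughly the information shown in the screenshot (the exact output will of course differ per environment):

```bash
# List the master and worker nodes along with their status, K8s version and OS
kubectl get nodes -o wide

# Show the kubectl client version and the cluster (API server) version
kubectl version
```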

Workloads Form-factor

In vSphere, a virtual machine is the logical boundary of an operating system. In Kubernetes, pods are the boundaries for containers. Just like an ESXi host can run multiple VMs, a K8s node can run multiple pods. Each pod gets a routable IP address, just like a VM, to communicate with other pods.

In vSphere, applications run inside an OS, while in Kubernetes applications run inside containers. A virtual machine runs one single OS, while a pod can run multiple containers.

This is how you can list the pods inside a K8s cluster using the kubectl tool from the CLI. You can check the health of the pods, their age, their IP addresses and the nodes they are currently running on.
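
In case you want to reproduce that view yourself, the command is roughly the following; the -o wide flag is what adds the pod IP addresses and the nodes they were scheduled on:

```bash
# List all pods across every namespace, including their IPs and hosting nodes
kubectl get pods --all-namespaces -o wide
```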

Management

So how do we manage our master, nodes and pods? In vSphere, we use the Web Client to manage most (if not all) of the components in our virtual infrastructure. It is almost the same with Kubernetes and the Dashboard: a nice GUI-based web portal that you access with your browser, just like you do with the Web Client. We have also seen in the previous sections that you can manage your K8s cluster using the kubectl command from the CLI. It is always debatable where you will spend most of your time, the CLI or the Dashboard, especially since the latter is becoming more powerful every day (check this video for more details). I personally find the Dashboard very convenient for quickly monitoring the health or showing the details of the various K8s components, rather than typing long commands. It is a matter of preference, and you will find the balance between them naturally.
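
Assuming the Dashboard add-on is already deployed in your cluster, the usual way to reach it from your workstation is through the API server proxy; note that the exact URL path varies between Dashboard versions:

```bash
# Open a local proxy to the cluster's API server
kubectl proxy

# Then browse to the Dashboard through the proxy; on older releases it is
# exposed under a path like http://localhost:8001/ui
```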

Configurations

One of the most profound concepts in Kubernetes is the desired state of configuration. You declare what you want for almost any Kubernetes component in a YAML file and create it with kubectl (or through the Dashboard) as your desired state. From that moment onwards, Kubernetes will always strive to keep that as the running state of your environment. For example, if you want to have 4 replicas of a pod, K8s will keep monitoring those pods, and if one dies or the node it is running on has issues, it will self-heal and automatically recreate that pod somewhere else.

Back to our YAML configuration files: you can think of them like the .VMX file of a VM, or the .OVF descriptor of a virtual appliance that you want to deploy in vSphere. These files define the configuration of the workload or component you want to run. Unlike VMX/OVF files, which are exclusive to VMs and appliances, YAML configuration files are used to define any K8s component, such as ReplicaSets, Services and Deployments, as we will see in the coming sections.
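
To make the VMX analogy concrete, here is a minimal, hypothetical pod definition (the name and image are placeholders for illustration); you would normally save it to a file and feed it to kubectl, or pipe it in directly as shown here:

```bash
# A minimal Pod manifest: think of it as the "VMX file" of a single workload.
# The pod name and container image below are placeholders.
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: sphinx-web
  labels:
    app: sphinx
spec:
  containers:
  - name: web
    image: nginx:1.13
    ports:
    - containerPort: 80
EOF
```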

Virtual Clusters

In vSphere, we have physical ESXi hosts grouped logically to form clusters, and we can slice those clusters into virtual clusters called “Resource Pools”, which are mostly used for capping resources. In Kubernetes, we have something very similar: “namespaces”, which can also be used to enforce resource quotas, as we will see in the next section. Most commonly, however, they are used as a means of multi-tenancy across applications (or across users if you are sharing a K8s cluster). Namespaces are also one of the boundaries we can use for security segmentation with NSX-T, as we will see in future posts.
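
Creating and using a namespace is a one-liner; the namespace name below is just an example:

```bash
# Create a namespace per team or application, much like carving out a resource pool
kubectl create namespace finance-app

# From then on, workloads are created and listed per namespace
kubectl get pods --namespace finance-app
```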

Resource Management

As I mentioned in the previous section, namespaces in Kubernetes are commonly used as a means of segmentation. Their other use case is resource allocation, referred to as “Resource Quotas”. As we saw in previous sections, these are defined through YAML configuration files in which we declare the desired state. In vSphere, we simply define this in the Resource Pool settings, as you can see in the screenshot below.
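
On the Kubernetes side, a sketch of such a ResourceQuota applied to the example namespace from the previous section could look like this; the numbers are arbitrary and would be sized to your environment:

```bash
# A ResourceQuota capping CPU, memory and pod count inside a namespace,
# roughly the equivalent of the limits you set on a vSphere Resource Pool
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: ResourceQuota
metadata:
  name: finance-app-quota
  namespace: finance-app
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"
EOF
```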

Workloads Identification

This is fairly easy and almost identical between vSphere and Kubernetes. In the former, we use tags to identify (or group) similar workloads, while in the latter we use “labels” to do the same. In Kubernetes’ case, labelling is essentially mandatory rather than optional, because we rely on “selectors” to identify our pods and apply the different configurations to them.
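
In practice, labels are either declared in the manifest under metadata.labels (as in the pod example earlier) or attached on the fly, and selectors then match on them; the label key and value here are purely illustrative:

```bash
# Attach a label to an existing pod
kubectl label pod sphinx-web tier=frontend

# Select everything carrying that label
kubectl get pods -l tier=frontend
```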

Redundancy

Now to the real fun. If you were (or are) a big fan of vSphere FT like me, you will love this feature in Kubernetes, despite some differences between the two technologies. In vSphere, FT is a VM with a running instance and a shadow instance in lock-step: we record the instructions from the running instance and replay them on the shadow VM. If the running instance goes down, the shadow VM kicks in immediately, and vSphere then tries to find another ESXi host on which to bring up a new shadow instance to maintain the same redundancy. In Kubernetes, we have something very similar: with ReplicaSets, you specify the number of pod instances you want to run. If one pod goes down, the other instances remain available to serve traffic. At the same time, K8s will try to bring up a substitute for that pod on any available node to maintain the desired state of the configuration. The major difference, as you may have already noticed, is that in the K8s case the pod instances are always live and serving traffic; they are not shadow workloads.
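
Here is a sketch of a ReplicaSet that keeps four instances of our example pod alive at all times; the extensions/v1beta1 API group matches the K8s 1.6 cluster used in this post, while newer clusters expose ReplicaSets under apps/v1:

```bash
# A ReplicaSet maintaining 4 live pod instances at all times
cat <<'EOF' | kubectl apply -f -
apiVersion: extensions/v1beta1
kind: ReplicaSet
metadata:
  name: sphinx-rs
spec:
  replicas: 4
  selector:
    matchLabels:
      app: sphinx
  template:
    metadata:
      labels:
        app: sphinx
    spec:
      containers:
      - name: web
        image: nginx:1.13
        ports:
        - containerPort: 80
EOF

# List the 4 replicas; delete any one of them and a replacement shows up shortly
kubectl get pods -l app=sphinx
```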

Load Balancing

While this might not be a built-in feature in vSphere, it is still a very common thing to run load balancers on that platform. In the vSphere world, we have either virtual or physical load balancers to distribute network traffic across multiple VMs. These can run in many different configuration modes, but let’s assume here that we are referring to the one-armed configuration. In this case, you are load-balancing your east-west network traffic to your VMs.

Similarly, in Kubernetes we have the concept of “Services”. A K8s Service can also be used in different configuration modes, but let’s pick the “ClusterIP” configuration here to compare with the one-armed LB. In this case, our K8s Service has a virtual IP that is static and does not change, and that VIP distributes the traffic across multiple pods. This is especially important in the Kubernetes world, where pods are ephemeral by nature and you lose a pod’s IP address the moment it dies or gets deleted, so you always need a static VIP in front of them.
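
A minimal sketch of such a Service, fronting the example pods labelled app=sphinx from the earlier sections, looks like this:

```bash
# A ClusterIP Service: a stable VIP in front of all pods labelled app=sphinx,
# comparable to a one-armed load balancer VIP in front of a pool of VMs
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: sphinx-svc-1
spec:
  type: ClusterIP
  selector:
    app: sphinx
  ports:
  - port: 80
    targetPort: 80
EOF

# The CLUSTER-IP column shows the VIP that stays the same while pods come and go
kubectl get service sphinx-svc-1
```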

As I mentioned, Services have other configurations as well, such as “NodePort”, where you assign a port at the node level and then do port address translation down to the pods, and “LoadBalancer”, where you spin up an LB instance from a third party or a cloud provider.

There is another very important load-balancing mechanism in Kubernetes called the “Ingress Controller”. You can think of it as an in-line application load balancer. The core concept is that an ingress controller (in the form of a pod) is spun up with an externally visible IP address, and that IP could have something like a wildcard DNS record pointing to it. When traffic hits the ingress controller on that external IP, it inspects the headers and determines, through a set of rules you pre-define, which service (and ultimately which pods) that hostname should be directed to. For example, sphinx-v1.esxcloud.net will be directed to the service “sphinx-svc-1”, while sphinx-v2.esxcloud.net will be directed to the service “sphinx-svc2”, and so on.
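
A sketch of that Ingress, reusing the hostnames and service names from the example above, could look like the following (extensions/v1beta1 is the Ingress API of the K8s 1.6 era; current clusters use networking.k8s.io/v1 with a slightly different schema):

```bash
# Hostname-based routing rules evaluated by the ingress controller
cat <<'EOF' | kubectl apply -f -
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: sphinx-ingress
spec:
  rules:
  - host: sphinx-v1.esxcloud.net
    http:
      paths:
      - backend:
          serviceName: sphinx-svc-1
          servicePort: 80
  - host: sphinx-v2.esxcloud.net
    http:
      paths:
      - backend:
          serviceName: sphinx-svc2
          servicePort: 80
EOF
```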

Storage & Networking

Storage and networking are very rich topics when it comes to Kubernetes. It is almost impossible to cover them briefly in an introductory blog post, but you can be sure that I will be blogging soon, in detail, about the different concepts and options for each subject. For now, let’s quickly examine how the networking stack works in Kubernetes, since a later section depends on it.

Kubernetes has different networking “plugins” that you can use to set up the networking of your nodes and pods. One of the common plugins is “kubenet”, which is currently used on mega-clouds like GCP and AWS. I am going to talk briefly here about the GCP implementation, and then show you a practical example later so you can examine this yourself on GKE.

This might look like a bit too much to take in at first glance, but hopefully you will be able to make sense of it all by the end of this blog post. First, we see that we have two Kubernetes nodes here, node 1 and node (m). Each node has an eth0 interface, like any Linux machine, and that interface has an IP address facing the external world, in our case on subnet 10.140.0.0/24. The upstream L3 device acts as our default gateway to route our traffic. It could be an L3 switch in your datacenter or a VPC router in a public cloud like GCP, as we will see later. So far so good?

Next, we see that inside each node there is a cbr0 bridge interface. That interface holds the default gateway of the IP subnet 10.40.1.0/24 in the case of node 1. This subnet is assigned by Kubernetes to each node; a node usually gets a /24, but you can of course control that when using NSX-T, as we will see in future posts. For now, this subnet is the one from which pod IP addresses are allocated, so any pod inside node 1 will get an IP address from this range. In our case, Pod 1 has the IP address 10.40.1.10.

You will notice, however, that this pod has two containers nested within it. Remember we said that a pod can run one or more containers that are tightly coupled in terms of functionality; this is what we see here. Container 1 is listening on port 80, while container 2 is listening on port 90. Both containers share the same IP address, 10.40.1.10, yet neither of them owns that networking namespace. So who owns this networking stack, then? It is actually a special container called the “pause container”. You can see it in the diagram as the interface of the pod to the outer world. It owns the networking stack, including the IP address 10.40.1.10 itself, and because both application containers share that network namespace, traffic to port 80 reaches container 1 while traffic to port 90 reaches container 2.
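
If you are curious, you can see those pause containers for yourself on any node that uses Docker as its container runtime; each pod shows up with one extra container holding its network namespace:

```bash
# On a Kubernetes node using Docker, every pod has an extra "pause" container
# that owns the pod's network namespace
docker ps | grep -i pause
```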

Now you should be asking: how is traffic forwarded to the external world? You can see that we have standard Linux IP forwarding enabled here to forward traffic from cbr0 to eth0. That is great, but then how does the L3 device know how to forward traffic back to its destination? We do not have dynamic routing to advertise these networks in this particular example. This is why we need some kind of static routes on that L3 device, telling it that to reach subnet 10.40.1.0/24 the next hop is the external IP of node 1 (10.140.0.11), and to reach subnet 10.40.2.0/24 the next hop is node (m) with the IP address 10.140.0.12.
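
Expressed in plain routing terms (shown here in Linux ip route syntax as a stand-in; your actual L3 device will have its own syntax), those static routes would look like this:

```bash
# Route each node's pod subnet via that node's external IP
ip route add 10.40.1.0/24 via 10.140.0.11   # pod subnet behind node 1
ip route add 10.40.2.0/24 via 10.140.0.12   # pod subnet behind node (m)
```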

This is great, but you must be thinking that it is a very impractical way to manage your networks. Maintaining all those routes as your cluster scales would be an absolute nightmare for network administrators. And you are right; this is why we need solutions built on the CNI (Container Network Interface) in Kubernetes, so that a networking platform manages this for you. NSX-T is one of those solutions, with a very powerful design for both the networking and security stacks.

Remember, we are examining the kubenet plugin here, not CNI. The former is what Google Container Engine (GKE) uses, and the way they do it is quite fascinating, as it is completely programmable and automated on their cloud. Those subnet allocations and their associated routes are taken care of for you by GCP, as we will see in the next part.

What’s Next?

It is time now to get your hands dirty with Kubernetes. You can start exploring all these concepts in a practical way by checking out the next part of this article at this link: http://www.hanymichaels.com/2017/10/18/kubernetes-introduction-for-vmware-users-part-2-the-practice/
