“A drone shot of colorful shipping containers in a shipping terminal” by chuttersnap on Unsplash

Kubernetes on vSphere (Packer - Part 1)

I used to work with really high-level clouds, especially gcloud. I have always been astonished by the power of such a platform: three and a half clicks to deploy something that works perfectly and is robust, easy to use and easy to manage.

On gcloud, getting a Kubernetes cluster takes less than 5 minutes

Last year, before starting Kuranda Labs and entering the healthcare space, I would never have imagined that one of the major pain points would be the cloud. In France, if you want to run a healthcare application, and especially store its data, do not imagine for a single second that you can do it on Azure or AWS.

To be eligible, providers need to have data centers in France and obtain some kind of certification. Some have only a partial level of certification, so they can only sell virtual machines. They sell them for more than 1k€ a month, and if you want another VM you have to wait an incredibly long 8 weeks. Let’s be serious.

We finally got our infrastructure at OVH for about 1.5k€ a month, which is also quite expensive for a startup: two hosts with 8 cores / 32 GB of RAM, both managed by a VMware vSphere platform.

With vSphere, three and a half clicks will get you almost nothing

Now that we have our infrastructure, we should be happy, right? The truth is that we are far from having a healthy, running Kubernetes cluster. vSphere lets you manage VMs, storage, DHCP, etc. We basically need to do everything ourselves!

I am writing these articles to share my journey from a bare vSphere to a healthy, running Kubernetes cluster: let’s go from zero to one.


Why use Kubernetes?

This question might sound crazy to people familiar with deploying containerized applications, but it was the starting point of our thinking.

Google created Kubernetes from their massive experience in deploying and scaling applications. They open-sourced it so that the developer community can make the most of it (and also to sell their cloud through the perfect integration of Kubernetes into gcloud :-) ). It lets developers manage the whole lifecycle of containerized applications (deployments, scaling, rollouts, rollbacks, etc.). It basically gives us the right abstraction to manage our workloads in the cloud. Running such a cluster will allow us to forget about virtual machines and about how we split workloads between our physical machines.

Docker also has a similar orchestrator built directly into its engine, named Docker Swarm. Despite Swarm’s perfect integration (Docker is the most popular tool to handle containers), Docker had to add support for Kubernetes because it was massively adopted by the community. We could not resist such a powerful tool…

The problem here is that deploying a Kubernetes cluster to vSphere is not that easy. I would say it is actually really painful. This series of articles will present the different steps needed to do so:

  1. Create VM templates with Packer (You are here)
  2. Create nodes with Terraform
  3. Create clusters with kubeadm

Packer: the Docker of VMs

We are not going into much detail about Kubernetes here, but one thing you should know is that a cluster is made of Nodes. A Node is basically a worker machine on which computation units named Pods are scheduled and run. What matters here is that we will need to create some VMs on our platform.

The dumbest way to do it would be to manually create a virtual machine directly on vSphere, install the packages needed to run Kubernetes, export it as a template, create as many VMs from it as we need nodes, and finally start a cluster. The problem with this method is that it is not automated at all. I need to click on buttons to start the OS setup and to load configuration files and scripts. I end up with a template that I cannot share or easily track with Git. To sum up, this approach is broken.

To build Docker images, we have one single source of truth: a Dockerfile that describes how the image should be built. We want exactly the same system for our VMs. We would be more than happy if this tool could also be deeply integrated into vSphere so that we can build directly inside the platform. Well, it seems we are really lucky because Packer does exactly this. From their documentation:

Packer is an open source tool for creating identical machine images for multiple platforms from a single source configuration

Packer overview

As we have seen, Packer gives us a single source of truth. This is achieved with a JSON file called a template. Because this is not a tutorial about Packer, I will just briefly present the major components of this file as they are described in the documentation:

  1. Builders: components of Packer that are able to create a machine image for a single platform. Builders read in some configuration and use that to run and generate a machine image.
  2. Provisioners: components of Packer that install and configure software within a running machine prior to that machine being turned into a static image. In practice, they run after the OS has finished installing.
  3. Post-processors: components of Packer that take the result of a builder or another post-processor and process that to create a new artifact.

Packer reads a template and creates what it calls artifacts, which can target multiple platforms. An artifact is basically the set of files that make up a machine. Each platform has its own definition of a machine, which means different types of files: this is why the tool is so valuable. If you want to go deeper, you could also check out Vagrant, which gives you a single unified environment to manage machines (a single set of files for a platform-agnostic definition).

Install packages

You will first need to install Packer on your system. I am running macOS, but Packer is available for all major platforms. Just follow Packer’s guide.

To be able to use the VMware integration, you also need to download another tool, which depends on your system. In my case, that is VMware Fusion. Do not worry, this tool can run in the background during the process.

Building our template

  1. We define one single builder with the type vmware-iso because we are only going to provision our vSphere. You could build for multiple platforms at the same time.
  2. We provide information about the ISO file used to build the machine. We are using CentOS, but you can choose the OS you want. Packer will first look for the ISO locally and will download it if needed.
  3. We choose some SSH credentials and a timeout. You should go for a really high timeout because Packer tries to connect to the machine over SSH to run the provisioners and keeps trying until the OS is installed, which might take some time. 1 hour should be enough.
  4. We specify an “http_directory” to expose one of our directories through the HTTP server Packer starts during the process. The machine being built can fetch files from this folder during the build phase. This is where the installer retrieves the seed file for our CentOS 7.
  5. The headless mode decides whether VMware Fusion runs in the background or not. Even if you set it to false, you will be given a URL to connect to the machine.
  6. The “boot_command” specifies the URL to query to retrieve the seed.
  7. The last part is less interesting: it is mainly about the hardware configuration.
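As a rough illustration of the seven points above, a builder section could look like the following sketch. It is not our exact template: the file name base-template.sketch.json, the VM name, the credentials, the checksum placeholder, the shutdown command and the hardware values are all assumptions, and the option names (iso_checksum_type in particular) follow the JSON template format of older Packer releases, which newer releases may have changed.

```shell
# Sketch only: writes a hypothetical Packer template with a single vmware-iso builder.
# Every value below (names, password, checksum, sizes) is a placeholder assumption.
cat > base-template.sketch.json <<'EOF'
{
  "builders": [
    {
      "type": "vmware-iso",
      "vm_name": "centos7-k8s-base",
      "guest_os_type": "centos-64",

      "iso_url": "iso/centos.iso",
      "iso_checksum_type": "sha256",
      "iso_checksum": "REPLACE_WITH_THE_SHA256_OF_YOUR_ISO",

      "ssh_username": "root",
      "ssh_password": "REPLACE_ME",
      "ssh_timeout": "60m",

      "http_directory": "http",
      "headless": true,

      "boot_command": [
        "<tab> text ks=http://{{ .HTTPIP }}:{{ .HTTPPort }}/ks.cfg<enter><wait>"
      ],

      "shutdown_command": "shutdown -P now",
      "disk_size": 20480,
      "vmx_data": {
        "memsize": "2048",
        "numvcpus": "2"
      }
    }
  ]
}
EOF
echo "wrote base-template.sketch.json"
```

In this sketch, {{ .HTTPIP }} and {{ .HTTPPort }} are filled in by Packer with the address of the temporary HTTP server that exposes the http/ folder, which is how the installer reaches ks.cfg. The placeholders have to be replaced before the file can be used for a real build.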

This template is valid, and you can check its syntax with the following command:

packer validate base-template.json

We also want to run some scripts after the OS is installed. Indeed, to run Kubernetes we have to perform some basic tasks that you can find here. This can be done with Provisioners. We provide a list of scripts to be run over the SSH connection mentioned earlier. You can check them in the gist above.
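A provisioners section for this step could be sketched as follows. The script names come from the scripts/ folder of our project (listed at the end of this article), but the order shown here and the file name provisioners.sketch.json are assumptions made for illustration.

```shell
# Sketch only: a "provisioners" section using Packer's shell provisioner.
# It is written to its own file so it can be merged by hand into the template.
# The script order is an assumption; scripts run in the order they are listed.
cat > provisioners.sketch.json <<'EOF'
{
  "provisioners": [
    {
      "type": "shell",
      "scripts": [
        "scripts/vmtools.sh",
        "scripts/kubeadm.sh",
        "scripts/cleanup.sh",
        "scripts/zerodisk.sh"
      ]
    }
  ]
}
EOF
echo "wrote provisioners.sketch.json"
```

The "provisioners" key sits at the top level of the template, next to "builders".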

Besides, as explained before, Packer is deeply integrated into the different platforms. This is the case with vSphere: Packer provides a really powerful set of Post-processors to manipulate artifacts after the build phase. We can directly upload the artifact to the platform and turn it into a template. We modify our template to run the different processors:

Post-processors receive the output of the builders. Because we want the vsphere-template post-processor to receive the output of the vsphere post-processor instead, we nest both in an inner list of post-processors (as described here). Note that we are using the templating engine of Packer to read the information about our vSphere (credentials, etc.) from environment variables.
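The nesting and the environment-variable templating can be sketched like this. The VSPHERE_* variable names, the inventory placeholders, the VM name and the insecure setting (which skips TLS certificate verification) are assumptions, not the values we actually use.

```shell
# Sketch only: user variables read from the environment, plus a nested
# post-processor sequence (vsphere upload first, then vsphere-template).
# The VSPHERE_* variable names and all inventory names are hypothetical.
cat > post-processors.sketch.json <<'EOF'
{
  "variables": {
    "vsphere_host": "{{env `VSPHERE_HOST`}}",
    "vsphere_user": "{{env `VSPHERE_USER`}}",
    "vsphere_password": "{{env `VSPHERE_PASSWORD`}}"
  },
  "post-processors": [
    [
      {
        "type": "vsphere",
        "host": "{{user `vsphere_host`}}",
        "username": "{{user `vsphere_user`}}",
        "password": "{{user `vsphere_password`}}",
        "datacenter": "REPLACE_WITH_DATACENTER",
        "cluster": "REPLACE_WITH_CLUSTER",
        "datastore": "REPLACE_WITH_DATASTORE",
        "vm_name": "centos7-k8s-base",
        "insecure": true
      },
      {
        "type": "vsphere-template",
        "host": "{{user `vsphere_host`}}",
        "username": "{{user `vsphere_user`}}",
        "password": "{{user `vsphere_password`}}",
        "datacenter": "REPLACE_WITH_DATACENTER",
        "insecure": true
      }
    ]
  ]
}
EOF
echo "wrote post-processors.sketch.json"
```

The inner list is what chains the two steps: each entry of the outer list starts from the builder's artifact, while entries inside the same inner list pass their result to the next one.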

The vsphere post-processor will upload the image to our platform. It uses the ovftool command-line tool to perform this task: just check the source on GitHub to verify this and see the logic of the post-processor. We first installed ovftool from the VMware website, but we kept bumping into a weird error. Using the ovftool bundled with the VMware Fusion we had just downloaded did the trick. Just add it to your $PATH:

export PATH="/Applications/VMware Fusion.app/Contents/Library/VMware OVF Tool/:$PATH"

An export only lasts for the current terminal session, so add the line to your shell profile if you want to keep it. Then check that ovftool is installed and in your $PATH:

ovftool --help
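If you prefer a check that does not print the whole help text, a small helper like the one below reports whether both command-line tools used in this article can be found. The have_tool function name is just an illustration.

```shell
# Returns 0 if the given command can be found on PATH, 1 otherwise.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Report on the two command-line tools this walkthrough relies on.
for tool in packer ovftool; do
  if have_tool "$tool"; then
    echo "$tool found at $(command -v "$tool")"
  else
    echo "$tool is NOT on PATH" >&2
  fi
done
```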

The vsphere-template post-processor will then mark the newly uploaded machine as a template. We now have the final version of our template, which means we can start the build phase by running the following command:

packer build template.json
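Since the template reads the vSphere connection details from environment variables, it can be worth checking that they are set before launching a long build. The following is a sketch: check_env is a hypothetical helper, and the VSPHERE_* names are assumptions that must match whatever names your template actually reads.

```shell
# Returns 0 only if every variable name passed as an argument is exported and non-empty.
check_env() {
  missing=0
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Hypothetical variable names; use the ones your template reads.
if check_env VSPHERE_HOST VSPHERE_USER VSPHERE_PASSWORD; then
  echo "environment looks complete, ready for: packer build template.json"
else
  echo "export the variables above before running the build" >&2
fi
```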

Packer will parse the template and start the process. The seed will be queried and the OS installation will start automatically. Do not freak out if you seem to be stuck at this step for several minutes: it might take some time.

You can also connect directly to your machine using the URL provided in the Packer output. You can download a VNC client and check that the installation is in progress. In this example, we needed to connect to 127.0.0.1 on port 5904.

When the installation is done, you should see in the console that Packer runs the provisioner scripts and then tries to upload the result to the vSphere platform. Normally, when checking your vSphere, you should see that a new template has been added to your cloud.

If this is the case, it means you nailed it.


Wrapping up

We have created our single source of truth to automate the workflow of creating a virtual machine. Just as with a Dockerfile, we can track versions of our machine in a simple Git repository. Besides, we deploy directly to our platform, which is super convenient!


Our project

// Our source of truth
template.json
// Exposed as a server
http/
    // The seed used to install the OS
    ks.cfg
iso/
    // OS
    centos.iso
// Configuration scripts
scripts/
    cleanup.sh
    kubeadm.sh
    vmtools.sh
    zerodisk.sh

This is the seed we used (we got this one from the vSphere CentOS installer):