Deploying an OKD Cluster

OKD, previously known as OpenShift Origin, is the upstream community project for Red Hat OpenShift. In general, people will pay Red Hat for either the enterprise or cloud versions of the application. However, if you don’t want to pay and have your own hardware available to install it on, you can deploy your own cluster. This is very useful if you have multiple users who want to spin up projects. At Computer Science House we had this exact need and the resources to build it on, so I helped build our new cluster.


What is OKD and why would you want it? OKD builds heavily on Kubernetes and layers a broad set of management tools on top of it for any projects you run on the cluster. This even includes a very attractive UI that makes spinning up a project from scratch take just a matter of minutes from creation to deployment. Under the hood it also helps administrators by providing a command-line interface and administrative UIs that turn typically tedious tasks into a few clicks or commands.

At Computer Science House we have over 100 active members, most of whom want to spin up projects frequently. This meant any time someone wanted to work on a project, we had to allocate them a virtual machine and the resources that came with it. This was inefficient both in our administrators' time and in resources, since we would typically allocate more than the project actually needed. The solution we arrived at was to stand up an OpenShift Origin cluster, let members create projects at will, and monitor the usage. One of our members (credit to Steven Mirabito) set up an OpenShift Origin 3.7 cluster for us. This worked really well; too well, in fact. The cluster was incredibly popular, and over the two years it stood it started to run out of resources and degrade slightly from successive updates.

So when it came time to look at upgrading again, I took on the task of spinning up a new cluster from scratch: this time on the latest version, with more resources to ensure we don't run out again, with more storage, and with the lessons learned from the old cluster.

How to begin

Before you can really begin, you need to decide how large your cluster is going to be. You need at least three to four machines (VMs or physical) for a full cluster (etcd can be co-located with a master, which is what brings the minimum down to three):

  • A load balancer (lb)
  • A master (master)
  • An etcd node (etcd)
  • A compute and/or infrastructure node (node)

In the case of our cluster, I built up:

  • 1 load balancer
  • 5 masters
  • 5 etcd nodes
  • 2 infrastructure nodes
  • 13 compute nodes

These nodes can run either Red Hat Enterprise Linux or CentOS, and all have minimum specifications that must be met. Those are defined in the OKD documentation. From there, there are a lot of system prerequisites that need to be met. It's near impossible for me to list them all, so luckily the documentation provides them.

In general, when in doubt, refer to the guide and read carefully. Once you have your nodes set up with the required specs, it’s time to define your cluster and start installing.

Defining the Cluster

I used the Ansible installation method for the cluster. Despite running into many issues, it was the most reliable method for us. If you want to try out the Atomic Host method, go for it; however, a lot of this post will not apply. The most important file in the Ansible setup lives at /etc/ansible/hosts. This is your Ansible host inventory, and it's where you define how your cluster is laid out. All the Ansible playbooks that build your cluster pull from this file, and it essentially defines the environment variables for the installation. You can read about this in more detail in the documentation.
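As a rough sketch (the hostnames and values here are illustrative, not our exact configuration), the variables section of a 3.11 inventory typically starts something like this:

```ini
[OSEv3:vars]
# User Ansible SSHes in as on every node (needs passwordless sudo)
ansible_ssh_user=root

# Install the community distribution (OKD), pinned to the 3.11 release
openshift_deployment_type=origin
openshift_release="3.11"

# Public hostname users will hit for the web console / API
openshift_master_cluster_public_hostname=okd.example.org

# Wildcard domain that application routes get created under
openshift_master_default_subdomain=apps.okd.example.org
```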

Nodes, etcd, and Masters

The first important step is defining your compute nodes, etcd nodes, and masters. The inventory comes preloaded with an [OSEv3:children] group, which declares the types of hosts that can exist in the inventory.
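In a typical 3.11 inventory (a sketch, not our exact file), that children block looks like:

```ini
[OSEv3:children]
masters
nodes
etcd
lb
nfs
```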


You then need to define your node groups, which map onto labeled groups of Kubernetes nodes. Below is our configuration, where we just define a master, an infrastructure node, and a compute node group:

openshift_node_groups=[{'name': 'node-master', 'labels': ['']}, {'name': 'node-infra', 'labels': ['']}, {'name': 'node-compute', 'labels': ['']}]

Finally, we define the host groups, which are what Ansible uses to connect to the boxes and conduct the full installation. The [01:05] notation is super useful for defining sequential runs of servers to loop through (remember, servers are cattle, not pets).

# Host group for masters
[masters]
okd-master[01:05]

# Host group for etcd
# (the etcd, lb, and NFS hostnames below are illustrative)
[etcd]
okd-etcd[01:05]

# Specify load balancer host
[lb]
okd-lb01

# Host group for NFS
[nfs]
okd-nfs01

# Host group for nodes
[nodes]
okd-master[01:05] openshift_node_group_name='node-master'
okd-node[01:02] openshift_node_group_name='node-infra'
okd-node[03:15] openshift_node_group_name='node-compute'

Additional Cluster Behavior

Now that you’ve defined all the systems that will be your cluster, you may want to define more details about it. Maybe you have a wildcard SSL certificate that you want to use, maybe you want to set up an authentication provider, or maybe you want to set up cluster metrics — all of these things also get defined in the inventory. I’m sure there are plenty of additional variables you can define, but those are just a few of the options I wanted to define for our cluster. There is fantastic documentation on both cluster metrics and aggregated logging and adding them to your cluster.
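As an illustration (certificate paths and values here are placeholders, not our real config), these kinds of options all live alongside the other [OSEv3:vars] entries:

```ini
# Wildcard certificate for the default router
openshift_hosted_router_certificate={'certfile': '/path/to/wildcard.crt', 'keyfile': '/path/to/wildcard.key', 'cafile': '/path/to/ca.crt'}

# An identity provider -- htpasswd shown here as a simple example
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]

# Deploy the cluster metrics stack (Hawkular/Cassandra/Heapster in 3.11)
openshift_metrics_install_metrics=true
```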

Running Ansible

Once you've gotten your inventory file set up, you're ready to run Ansible and be up and running, right? Well, sorta. This part is where the majority of the time goes. The Ansible scripts for setting up OpenShift/OKD are… less than perfect. Sometimes they might even merge broken changes into the release branch and ignore your PR. 🙃 However, once you finally get them to run, they certainly save you countless hours on the setup. For those planning to run the setup on RHEL7 boxes, make sure you have the optional and extras repos enabled before running Ansible; it also wouldn't hurt to run yum update before letting the scripts loose.
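On RHEL7, that preparation looks roughly like this (a sketch; the exact repo IDs depend on your subscription and Ansible version):

```shell
# Enable the repos the installer pulls packages from
subscription-manager repos \
    --enable=rhel-7-server-rpms \
    --enable=rhel-7-server-optional-rpms \
    --enable=rhel-7-server-extras-rpms \
    --enable=rhel-7-server-ansible-2.6-rpms

# Bring the box fully up to date before the playbooks run
yum update -y
```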

Once you have prepared all of the nodes as well as you think you can, clone the Ansible repo, checkout the 3.11 branch, run the prerequisite and deploy scripts, then wait... A long time.

git clone https://github.com/openshift/openshift-ansible ~/openshift-ansible
cd ~/openshift-ansible
git checkout release-3.11
ansible-playbook -i /etc/ansible/hosts ~/openshift-ansible/playbooks/prerequisites.yml
ansible-playbook -i /etc/ansible/hosts ~/openshift-ansible/playbooks/deploy_cluster.yml

There is likely going to be a step at which the script fails and requires manual intervention. There's only so much I could anticipate during the setup process; some errors were due to the scripts themselves, others to version mismatches or missing repos on the RHEL7 nodes in question. The scripts don't tend to tell you in detail what failed or why, so you'll need to spend some time investigating. I learned a lot about package managers, DNS, SSL, and Linux in general from this process. A few times there were even errors from Kubernetes itself that required a decent amount of research, and I frequently found the exact solution I needed sitting behind a Red Hat knowledgebase paywall.

The worst part, and the most time consuming, is that the Ansible playbooks don't let you pick back up where they failed; you will have to run them from the beginning many, many times. If you encounter a lot of errors, you may also need to run the uninstall playbook at some point to bring the nodes back to a nearly clean state to start from again (hopefully with most of the issues resolved by then). When in doubt, Google around, or even open issues on the openshift-ansible repo. It will likely be a lengthy and difficult process, but not an impossible one.
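The uninstall playbook ships in the same repo; assuming the checkout from above, resetting the cluster looks like:

```shell
# Tear OKD back off the nodes defined in the inventory
ansible-playbook -i /etc/ansible/hosts ~/openshift-ansible/playbooks/adhoc/uninstall.yml
```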

Final Setup

Once the Ansible run is finished, OpenShift should now be deployed! You should be able to visit your cluster at its outward-facing domain on port 8443 in a browser, where you'll be greeted by the web console and its service catalog (if you set up OpenID Connect or some other form of SSO, you should be prompted to log in first).
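You can also sanity-check the deployment from a master (or anywhere with the oc client installed; the hostname below is illustrative):

```shell
# Log in against the public API endpoint
oc login https://okd.example.org:8443

# Every node should report Ready
oc get nodes

# The router and registry pods should be running in the default project
oc get pods -n default
```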

You may have a few fewer catalog options, and likely no projects (certainly not the 107 our current cluster has 😬). But you'll surely fill it up soon enough.

In our case especially, since we had an existing cluster, a big part of my job was not only to stand up the new one, but also to import all the old projects into the new cluster for a seamless transfer with minimal downtime.

Deploying a Router

Depending on your use case, you may opt to set up any number of the router options that OKD provides. For our case, we defined CNAMEs for all services running on the cluster in our DNS stack to point to the cluster. We then wanted to set up a high availability router with IP failover to handle traffic to the cluster. The provided documentation more or less goes over everything you might need to know, but can certainly be a challenge to set up depending on your networking setup. However, I imagine for a lot of use cases (especially a test or demo cluster!) the default HAProxy router is more than good enough.
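As an illustration of the HA option (the virtual IPs and selector are placeholders for your environment), IP failover is set up with oc adm ipfailover, something like:

```shell
# Run keepalived-based failover pods on the infra nodes,
# floating two virtual IPs in front of the routers
oc adm ipfailover ipf-ha-router \
    --replicas=2 \
    --watch-port=80 \
    --selector="node-role.kubernetes.io/infra=true" \
    --virtual-ips="10.0.0.100-101" \
    --create
```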

Using the Cluster

Now that you’ve successfully deployed your OKD cluster, it’s time to use it to deploy some projects. OKD, although a little difficult to set up, is immensely useful from both a usability standpoint and from an infrastructure standpoint. You should be able to trust other engineers (or in our case, students) to deploy their own projects to OKD with minimal or even no intervention from system administrators/devops, and enjoy the free time you have from not manually provisioning virtual machines.
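From a member's perspective, deploying something is then as simple as (project and repo names here are hypothetical):

```shell
# Create a project to hold the app
oc new-project my-project

# Build and deploy straight from a Git repo using S2I
oc new-app https://github.com/example/my-app.git

# Expose the service with a route through the cluster router
oc expose svc/my-app
```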



Devin Matté

Software Engineering Student at Rochester Institute of Technology with a focus on Full Stack Web Development and DevOps