Installing a Highly-Available OpenShift Origin Cluster

Adilson Cesar
6 min read · Nov 12, 2017


This article proposes a reference architecture for a Highly Available installation of OpenShift Origin. We will outline the architecture of such an installation and walk through the installation process. The intention of this process is to perform iterative installations of the OpenShift Origin cluster.

Cluster Design & Architecture

Preparation…

Provision Servers

01 Master (OpenShift Web Console)
01 Infra (Router, Registry, Metrics, EFK)
01 Node (application pods)
All hosts run CentOS 7 x86_64 with a minimal installation.

Each of these servers should be provisioned with an SSH public key which can be used to access all hosts from the Ansible control node.
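If you have not set that up yet, a minimal sketch looks like the following; it assumes a root login and uses the hostnames from the DNS zone shown later in this article:

# ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
# for host in master.cirrus.io infra.cirrus.io node1.cirrus.io; do
    ssh-copy-id -i ~/.ssh/id_rsa.pub root@$host
  done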

Next, install the docker service on all nodes (Master, Infra, Node):

# yum install -y docker
# systemctl start docker.service
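Optionally, enable the service so docker comes back after a reboot, and confirm it is running:

# systemctl enable docker.service
# systemctl status docker.service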

Docker Storage Setup

During the Provision Servers step of this guide, we provisioned all of our nodes (including the master) with a docker volume attached as /dev/sdb. We’ll now configure docker to use that volume for all local docker storage.

Running a large number of containers in production requires a lot of storage space. Additionally, creating and running containers requires the underlying storage drivers to be configured to use the most performant options. The default storage options for Docker-formatted containers vary between the different systems and in some cases they need to be changed. A default installation of RHEL uses loopback devices, whereas RHEL Atomic Host has LVM thin pools created during installation. However, using the loopback option is not recommended for production systems.

Creating a new VG (Master, Infra, Node)

# vgcreate vgdocker /dev/sdb
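You can confirm that the physical volume and volume group were created as expected:

# pvs
# vgs vgdocker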

Edit the /etc/sysconfig/docker-storage-setup file and add the LV and VG names:

# cat /etc/sysconfig/docker-storage-setup
# Edit this file to override any configuration options specified in
# /usr/share/container-storage-setup/container-storage-setup.
#
# For more details refer to "man container-storage-setup"
CONTAINER_THINPOOL=docker
VG=vgdocker

Restarting Docker and Docker Storage Setup Services

# systemctl stop docker docker-storage-setup
# rm -rf /var/lib/docker/*
# systemctl start docker docker-storage-setup

A new logical volume will be created automatically. For more information, see How to use the Device Mapper storage driver.

# lvs
  LV     VG       Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  root   centos   -wi-ao---- <8.00g
  swap   centos   -wi-ao----   1.00g
  docker vgdocker twi-a-t---  15.91g             0.12   0.10
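You can also check that docker picked up the devicemapper thin pool; the output should report the devicemapper storage driver and a pool name derived from the vgdocker/docker names configured above:

# docker info | grep -iE 'storage driver|pool name'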

Preparing the Installer…

OpenShift uses Ansible as its installation and configuration manager. As we walk through design decisions, we capture them in an INI-style config file referred to as the Ansible inventory file. To start, we’ll establish a project skeleton for storing this file and begin populating it with information specific to this cluster:
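The full inventory is not reproduced here, but a minimal sketch for the three-host topology above might look like the following. The hostnames match the DNS zone shown later; values such as the release, the apps subdomain, and the htpasswd file path (matching the Authentication section below) are assumptions you should adapt to your environment:

# cat hosts
[OSEv3:children]
masters
nodes
etcd

[OSEv3:vars]
ansible_ssh_user=root
openshift_deployment_type=origin
openshift_release=v3.6
openshift_master_cluster_hostname=oc.cirrus.io
openshift_master_cluster_public_hostname=oc.cirrus.io
openshift_master_default_subdomain=apps.cirrus.io
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider', 'filename': '/etc/origin/htpasswd'}]

[masters]
master.cirrus.io

[etcd]
master.cirrus.io

[nodes]
master.cirrus.io
infra.cirrus.io openshift_node_labels="{'region': 'infra', 'zone': 'default'}"
node1.cirrus.io openshift_node_labels="{'region': 'primary', 'zone': 'default'}"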

More information about each parameter is available in the OpenShift Origin advanced installation documentation.

*** All of the hosts in the cluster need to be resolvable via DNS. Additionally, if using a control node to serve as the Ansible installer, it too should be able to resolve all hosts in your cluster.

In an HA cluster there should also be two DNS names for the load-balanced IP address that points to the master server for access to the API, CLI, and Console services. One of these names is the public name that users will use to log into the cluster. The other is an internal name that components within the cluster will use to talk back to the master. These values should also resolve in DNS, and will be set as variables in the Ansible hosts file.

$TTL 86400
@ IN SOA xxxx.xxxxxx.xxx. xxxx.xxxxxx.xxx. (
2017010101; Serial
3600 ;Refresh
1800 ;Retry
604800 ;Expire
86400 ;Minimum TTL
)
@ IN NS xxxx.xxxxxx.xxx
@ IN A XXX.XXX.XXX.XXX
oc IN A XXX.XXX.XXX.XXX
master IN A XXX.XXX.XXX.XXX
infra IN A XXX.XXX.XXX.XXX
node1 IN A XXX.XXX.XXX.XXX
*.apps.cirrus.io IN A XXX.XXX.XXX.XXX
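Before kicking off the installer, it is worth confirming from the control node that both the host records and the wildcard record resolve; something like the following (hostnames taken from the zone above, "test" being an arbitrary name to exercise the wildcard) should return the expected addresses:

# for host in oc master infra node1; do dig +short $host.cirrus.io; done
# dig +short test.apps.cirrus.io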

Preparing for Install

At this point in the process we are ready to prepare our hosts for install. The following sections guide us through this process.

# ansible-playbook -i hosts /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml

The install will run for 15–20 minutes. Good time for a coffee break.

A few minutes later: voilà!

OpenShift Web Console (Master)

Authentication…

For the initial installation we are going to use htpasswd for simple authentication and seed it with a couple of sample users so we can log into the OpenShift Console and validate the installation.

# htpasswd -c /etc/origin/htpasswd admin
New password:
Re-type new password:
Adding password for user admin

Then grant the cluster-admin role to the admin user so it has access to all projects in the cluster:

# oadm policy add-cluster-role-to-user cluster-admin admin
cluster role "cluster-admin" added: "admin"
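As a quick check, you should now be able to log in with that user from the CLI. Here I'm assuming oc.cirrus.io is the name pointing at the master API (see the DNS zone above) and the default Origin API port 8443:

# oc login https://oc.cirrus.io:8443 -u admin
# oc get projects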

Done!

In the default project, we can see the Registry service already running "containerized".

Registry Service with router configured

You can access it at https://registry-console-default.apps.cirrus.io, the URL provided by the Router service.

Registry (Infra node)
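To confirm that the registry and router pods actually landed on the infra node, list the pods in the default project with the node column included:

# oc get pods -n default -o wide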

Validating an OpenShift Install…

After having gone through the process of building an OpenShift environment, we should validate that the cluster is healthy.

Validate Nodes

# oc get nodes
NAME STATUS AGE VERSION
infra.cirrus.io Ready 1d v1.6.1+5115d708d7
master.cirrus.io Ready 1d v1.6.1+5115d708d7
node1.cirrus.io Ready 1d v1.6.1+5115d708d7

Check the output to ensure that:

  • All expected hosts (masters and nodes) are listed and show as Ready in the Status field
  • All masters show as unschedulable
  • All labels that were listed in the Ansible inventory file are accurate (the checks below can help confirm the last two points)
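A couple of quick commands help here: node labels can be listed directly, and the master's scheduling status is visible in its node description (the master hostname below matches this cluster):

# oc get nodes --show-labels
# oc describe node master.cirrus.io | grep -i unschedulable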

Validate Status of Default Project

The oc status command is helpful for validating that a namespace is in the state you expect it to be in. This is especially helpful after an install to check, at a high level, that all of the supporting services and pods you expect to exist actually do. After a successful install, healthy output should look something like:

# oc status
In project default on server https://master.cirrus.io

https://docker-registry-default.apps.cirrus.io (passthrough) (svc/docker-registry)
  dc/docker-registry deploys docker.io/openshift/origin-docker-registry:v3.6.1
    deployment #1 deployed 26 hours ago - 1 pod

svc/kubernetes - 172.30.0.1 ports 443, 53->8053, 53->8053

https://registry-console-default.apps.cirrus.io (passthrough) (svc/registry-console)
  dc/registry-console deploys docker.io/cockpit/kubernetes:latest
    deployment #1 deployed 26 hours ago - 1 pod

svc/router - 172.30.50.237 ports 80, 443, 1936
  dc/router deploys docker.io/openshift/origin-haproxy-router:v3.6.1
    deployment #1 deployed 26 hours ago - 1 pod

View details with 'oc describe <resource>/<name>' or list everything with 'oc get all'.

Run Diagnostics

OpenShift provides an additional CLI tool that can perform more fine-grained diagnostics, including validating that services can see each other, that certificates are valid, and much more. The output of a diagnostics run can be quite verbose, but it will include a final report of Errors and Warnings at the end. If there are errors or warnings, you may want to review them and confirm that they do not affect any critical services.

Not all errors or warnings warrant action. The diagnostics check will additionally examine all deployed services and report anything out of the ordinary. This could include apps that may have been misconfigured by a developer, and would not necessarily warrant administrative intervention.

# oadm diagnostics

Uninstalling OpenShift Origin

You can uninstall OpenShift Origin hosts in your cluster by running the uninstall.yml playbook. This playbook deletes OpenShift Origin content installed by Ansible, including:

  • Configuration
  • Containers
  • Default templates and image streams
  • Images
  • RPM packages

The playbook will delete content for any hosts defined in the inventory file that you specify when running the playbook. If you want to uninstall OpenShift Origin across all hosts in your cluster, run the playbook using the inventory file you used when initially installing OpenShift Origin, or the one you ran most recently:

# ansible-playbook -i hosts /usr/share/ansible/openshift-ansible/playbooks/adhoc/uninstall.yml
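If you only want to remove a subset of hosts, one approach is to point the same playbook at a pared-down inventory that lists just those hosts (if the rest of the cluster is staying up, drain and delete the node from the cluster first). A hypothetical example for removing only node1, using an inventory file name of my own choosing:

# cat hosts.node1-uninstall
[OSEv3:children]
nodes

[OSEv3:vars]
ansible_ssh_user=root
openshift_deployment_type=origin

[nodes]
node1.cirrus.io

# ansible-playbook -i hosts.node1-uninstall /usr/share/ansible/openshift-ansible/playbooks/adhoc/uninstall.yml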

Enjoy your Cluster :)
