Installing a Highly-Available OpenShift Origin Cluster
This article proposes a reference architecture for a highly available installation of OpenShift Origin. We will outline the architecture of such an installation and walk through the installation process. The intention is for this process to be repeatable, so that installations of the OpenShift Origin cluster can be performed iteratively.
Preparation…
Provision Servers
01 Master (OpenShift web console)
01 Infra (router, registry, metrics, EFK)
01 Node (application pods)
CentOS 7 x86_64 with a minimal installation.
Each of these servers should be provisioned with an SSH public key so that all hosts can be accessed from the Ansible control node.
Next, install and start the Docker service on all nodes (master, infra, node):
# yum install -y docker
# systemctl start docker.service
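If you want Docker to come back up after a reboot, you can also enable the service and confirm it is running; an optional but handy check:
# systemctl enable docker.service
# systemctl status docker.service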
Docker Storage Setup
During the Provision Servers step of this guide, we provisioned all of our nodes (including the master) with a Docker volume attached as /dev/sdb. We'll now configure Docker to use that volume for all local Docker storage.
Running a large number of containers in production requires a lot of storage space. Additionally, creating and running containers requires the underlying storage drivers to be configured to use the most performant options. The default storage options for Docker-formatted containers vary between the different systems and in some cases they need to be changed. A default installation of RHEL uses loopback devices, whereas RHEL Atomic Host has LVM thin pools created during installation. However, using the loopback option is not recommended for production systems.
Creating a new VG (Master, Infra, Node)
# vgcreate vgdocker /dev/sdb
Editing the /etc/sysconfig/docker-storage-setup file to add the LV and VG names
# cat /etc/sysconfig/docker-storage-setup
# Edit this file to override any configuration options specified in
# /usr/share/container-storage-setup/container-storage-setup.
#
# For more details refer to "man container-storage-setup"
CONTAINER_THINPOOL=docker
VG=vgdocker
Restarting Docker and Docker Storage Setup Services
# systemctl stop docker docker-storage-setup
# rm -rf /var/lib/docker/*
# systemctl start docker docker-storage-setup
A new logical volume will be created automatically. For more information, check out How to use the Device Mapper storage driver.
# lvs
LV     VG       Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
root   centos   -wi-ao---- <8.00g
swap   centos   -wi-ao----  1.00g
docker vgdocker twi-a-t--- 15.91g             0.12   0.10
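To double-check that Docker is now using the devicemapper driver backed by the new thin pool rather than loopback devices, inspect the storage section of docker info; with the VG and LV names above, the reported Pool Name should be vgdocker-docker:
# docker info | grep -A 2 'Storage Driver'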
Preparing the Installer…
OpenShift uses Ansible as its installation and configuration manager. As we walk through design decisions, we can capture this information in an INI-style config file referred to as the Ansible inventory file. To start, we'll establish a project skeleton for storing this file and begin populating information specific to our cluster.
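As a rough starting point, a minimal inventory for this topology might look something like the sketch below. The hostnames match the cirrus.io examples used later in this article, and the identity provider entry matches the htpasswd file we create in the Authentication section; the region labels and the remaining values are assumptions you should adapt to your own environment and Origin release.
# cat hosts
[OSEv3:children]
masters
nodes
etcd

[OSEv3:vars]
ansible_ssh_user=root
openshift_deployment_type=origin
openshift_release=v3.6
openshift_master_default_subdomain=apps.cirrus.io
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider', 'filename': '/etc/origin/htpasswd'}]

[masters]
master.cirrus.io

[etcd]
master.cirrus.io

[nodes]
master.cirrus.io
infra.cirrus.io openshift_node_labels="{'region': 'infra'}"
node1.cirrus.io openshift_node_labels="{'region': 'primary'}"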
More information about each parameter can be found in the openshift-ansible documentation.
Note: All of the hosts in the cluster need to be resolvable via DNS. Additionally, if using a control node to serve as the Ansible installer, it too should be able to resolve all hosts in your cluster.
In an HA cluster there should also be two DNS names for the load-balanced IP address that points to the master (in this walkthrough, a single master server) for access to the API, CLI, and console services. One of these names is the public name that users will use to log in to the cluster. The other is an internal name that components within the cluster will use to talk back to the master. These values should also resolve, and are placed in the Ansible inventory file via the openshift_master_cluster_public_hostname and openshift_master_cluster_hostname variables.
$TTL 86400
@                 IN  SOA  xxxx.xxxxxx.xxx. xxxx.xxxxxx.xxx. (
                           2017010101 ; Serial
                           3600       ; Refresh
                           1800       ; Retry
                           604800     ; Expire
                           86400 )    ; Minimum TTL
@                 IN  NS   xxxx.xxxxxx.xxx.
@                 IN  A    XXX.XXX.XXX.XXX
oc                IN  A    XXX.XXX.XXX.XXX
master            IN  A    XXX.XXX.XXX.XXX
infra             IN  A    XXX.XXX.XXX.XXX
node1             IN  A    XXX.XXX.XXX.XXX
*.apps.cirrus.io. IN  A    XXX.XXX.XXX.XXX
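Before running the installer, it is worth confirming from the control node that these records resolve. A quick spot check with dig (test.apps.cirrus.io is just an arbitrary name under the wildcard, which should resolve to the address your router will live on):
# dig +short master.cirrus.io
# dig +short infra.cirrus.io
# dig +short node1.cirrus.io
# dig +short test.apps.cirrus.io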
Preparing for Install
At this point in the process we are ready to run the installer against our hosts.
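The command below expects the openshift-ansible playbooks to be available under /usr/share/ansible/openshift-ansible on the control node. On CentOS 7 they can be pulled in from the CentOS PaaS SIG repository; package names vary slightly between Origin releases, so treat the following as a sketch for Origin 3.6:
# yum install -y centos-release-openshift-origin36
# yum install -y openshift-ansible openshift-ansible-playbooks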
# ansible-playbook -i hosts /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml
The install will run for 15–20 minutes. Good time for a coffee break.
A few minutes later: voilà!
Authentication…
For the initial installation we are simply going to use htpasswd authentication and seed it with a couple of sample users, so that we can log in to the OpenShift console and validate the installation.
# htpasswd -c /etc/origin/htpasswd admin
New password:
Re-type new password:
Adding password for user admin
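Since we want a couple of sample users rather than just one, you can append more entries to the same file by running htpasswd without the -c flag (the developer username here is only an example):
# htpasswd /etc/origin/htpasswd developer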
Then grant the cluster-admin cluster role to the admin user so that it has access to all projects in the cluster:
# oadm policy add-cluster-role-to-user cluster-admin admin
cluster role "cluster-admin" added: "admin"
Done!
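As a quick sanity check, you can log in as the new admin user from the CLI and confirm the elevated rights took effect; this assumes the API is listening on the default port 8443 on the master:
# oc login https://master.cirrus.io:8443 -u admin
# oc get projects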
In the default project, we can see that the registry service is already running in containers. You can access the registry console at https://registry-console-default.apps.cirrus.io, a URL exposed by the router service.
Validating an OpenShift Install…
After having gone through the process of building an OpenShift environment, it is worth running a few checks to validate that the cluster is healthy.
Validate Nodes
# oc get nodes
NAME               STATUS    AGE       VERSION
infra.cirrus.io    Ready     1d        v1.6.1+5115d708d7
master.cirrus.io   Ready     1d        v1.6.1+5115d708d7
node1.cirrus.io    Ready     1d        v1.6.1+5115d708d7
Check the output to ensure that:
- All expected hosts (masters and nodes) are listed and show as Ready in the Status field
- All masters show as unschedulable
- All labels that were listed in the Ansible inventory file are accurate (see the check below)
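To verify the last two points from the CLI, list the nodes together with their labels; any unschedulable masters will also show SchedulingDisabled in the STATUS column:
# oc get nodes --show-labels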
Validate Status of Default Project
The oc status command is helpful to validate that a namespace is in the state that you expect it to be in. This is especially helpful after doing an install to check, at a high level, that all of the supporting services and pods that you expect to exist actually do. At minimum, you should see the following after a successful install.
An example of a healthy output might look like:
# oc status
In project default on server https://master.cirrus.io

https://docker-registry-default.apps.cirrus.io (passthrough) (svc/docker-registry)
  dc/docker-registry deploys docker.io/openshift/origin-docker-registry:v3.6.1
    deployment #1 deployed 26 hours ago - 1 pod

svc/kubernetes - 172.30.0.1 ports 443, 53->8053, 53->8053

https://registry-console-default.apps.cirrus.io (passthrough) (svc/registry-console)
  dc/registry-console deploys docker.io/cockpit/kubernetes:latest
    deployment #1 deployed 26 hours ago - 1 pod

svc/router - 172.30.50.237 ports 80, 443, 1936
  dc/router deploys docker.io/openshift/origin-haproxy-router:v3.6.1
    deployment #1 deployed 26 hours ago - 1 pod

View details with 'oc describe <resource>/<name>' or list everything with 'oc get all'.
Run Diagnostics
OpenShift provides an additional CLI tool that can perform more fine-grained diagnostics, including validating that services can see each other, that certificates are valid, and much more. The output of a diagnostics run can be quite verbose, but will include a final report of Errors and Warnings at the end. If there are errors or warnings, you may want to go back through them and validate that they do not affect any critical services.
Not all errors or warnings warrant action. The diagnostics check will additionally examine all deployed services and report anything out of the ordinary. This could include apps that may have been misconfigured by a developer, and would not necessarily warrant administrative intervention.
# oadm diagnostics
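If you only want to run a subset of the checks, you can pass individual diagnostic names as arguments (oadm diagnostics --help lists the names available in your release); for example, something along these lines:
# oadm diagnostics ClusterRegistry ClusterRouter NodeDefinitions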
Uninstalling OpenShift Origin
You can uninstall OpenShift Origin hosts in your cluster by running the uninstall.yml playbook. This playbook deletes OpenShift Origin content installed by Ansible, including:
- Configuration
- Containers
- Default templates and image streams
- Images
- RPM packages
The playbook will delete content for any hosts defined in the inventory file that you specify when running the playbook. If you want to uninstall OpenShift Origin across all hosts in your cluster, run the playbook using the inventory file that you used for the initial install, or the one you ran most recently:
# ansible-playbook -i hosts /usr/share/ansible/openshift-ansible/playbooks/adhoc/uninstall.yml
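If you only want to uninstall a subset of the cluster, point the playbook at a trimmed-down inventory that contains just those hosts; the hosts.uninstall filename below is only an example:
# ansible-playbook -i hosts.uninstall /usr/share/ansible/openshift-ansible/playbooks/adhoc/uninstall.yml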
Enjoy your Cluster :)