Run Red Hat OpenShift Version 4 on Amazon Web Services (AWS) cloud platform — Part 1

Gang Chen
Nov 8, 2019 · 7 min read


By Gang Chen, Budi Darmawan and Jeffrey Kwong

Kubernetes has become the de facto standard for managing container-based workloads. Red Hat OpenShift builds on top of the Kubernetes orchestrator and provides a hybrid cloud platform for developing and running container workloads on any cloud provider, including AWS.

To get the best outcome from running OpenShift on AWS, you need proper planning and operational best practices, especially with the OpenShift 4.x releases, which promote concepts like Operators and immutable infrastructure. In this two-part blog series, I will explain the thought process behind planning a highly available OpenShift deployment on AWS and provide a detailed walkthrough of the installation process.

OpenShift on AWS Topology

I recommend the following infrastructure topology when running OpenShift on AWS. We'll then dive into the details of how we arrived at this solution in the Planning Decisions section.

Figure 1: OpenShift topology on AWS

Planning Decisions

A solid production deployment depends on carefully reviewing key decisions covering high availability, security, automation, DevOps, monitoring, and more. I'll go through some of these key decisions, which form the foundation of the OpenShift deployment on AWS.

In a later section, I will introduce an asset we built to automate the OpenShift installation on AWS; it contains the infrastructure configuration details. For now, make sure you understand the key requirements and design decisions below.

Installer-provisioned vs. User-provisioned infrastructure

OpenShift 4 completely changed the installation experience. You can install OpenShift with either installer-provisioned infrastructure (IPI) or user-provisioned infrastructure (UPI). I strongly recommend UPI for production installations, because IPI makes many assumptions about the infrastructure (networking, security) that an AWS administrator would not easily grant you. This blog is based on UPI.

AWS VPC

Amazon Virtual Private Cloud (VPC) lets you provision a logically isolated section of the AWS Cloud where you can launch AWS resources in a virtual network that you define.

You should design your OpenShift VPC with public and private subnets. I suggest putting all OpenShift nodes (EC2 instances) including masters and workers in private subnets across multiple availability zones (AZ). This approach allows us to put the critical OpenShift components behind the firewall in a private network. OpenShift installation does require access to the internet in order to:

· Access the Red Hat OpenShift Cluster Manager to download the installation program and perform subscription management and entitlement

· Access Quay.io to obtain the packages that are required to install your cluster

· Obtain the packages that are required to perform cluster updates

You can use an AWS NAT gateway to allow the OpenShift components in the private subnets to access the internet. OpenShift 4.2 introduced the disconnected (air-gapped) installation; in that case, you don't need NAT gateways at all.
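To make this concrete, the sketch below shows roughly how the VPC layout described above could be expressed in Terraform: a VPC, one public and one private subnet in a single availability zone, and a NAT gateway giving the private subnet outbound-only internet access. All names, CIDR ranges, and the region/AZ are illustrative assumptions; a real deployment repeats the subnet and NAT gateway pattern across three availability zones.

```
# Illustrative sketch only: CIDRs, names, and the AZ are assumptions.
resource "aws_vpc" "ocp" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = { Name = "ocp-vpc" }
}

resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.ocp.id
}

# Public subnet: hosts the NAT gateway (and any public load balancers).
resource "aws_subnet" "public_a" {
  vpc_id            = aws_vpc.ocp.id
  cidr_block        = "10.0.0.0/20"
  availability_zone = "us-east-1a"
}

# Private subnet: hosts the OpenShift master and worker EC2 instances.
resource "aws_subnet" "private_a" {
  vpc_id            = aws_vpc.ocp.id
  cidr_block        = "10.0.128.0/20"
  availability_zone = "us-east-1a"
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.ocp.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw.id
  }
}

resource "aws_route_table_association" "public_a" {
  subnet_id      = aws_subnet.public_a.id
  route_table_id = aws_route_table.public.id
}

# The NAT gateway gives the private subnet outbound-only access to the
# internet (Quay.io, Red Hat subscription management). Omit it for a
# disconnected (air-gapped) installation.
resource "aws_eip" "nat_a" {
  vpc = true # use `domain = "vpc"` on newer AWS provider versions
}

resource "aws_nat_gateway" "nat_a" {
  allocation_id = aws_eip.nat_a.id
  subnet_id     = aws_subnet.public_a.id
}

resource "aws_route_table" "private_a" {
  vpc_id = aws_vpc.ocp.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.nat_a.id
  }
}

resource "aws_route_table_association" "private_a" {
  subnet_id      = aws_subnet.private_a.id
  route_table_id = aws_route_table.private_a.id
}
```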

If you want to expose your OpenShift-hosted applications to the internet, you can instruct the OpenShift Ingress Controller (router) to provision a public load balancer (by default this is a Classic Load Balancer provisioned by Kubernetes, but you can instruct OpenShift to provision a Network Load Balancer instead). This load balancer is placed in the VPC public subnets to accept internet traffic.

As an alternative design, you can use two VPCs, with the second VPC acting as a DMZ, as shown below:

Figure 2: Alternative Infrastructure Topology

The DMZ VPC has a public and a private subnet. This is where you deploy your bastion host (or installation automation node) to kick off the Terraform installation or to SSH into the cluster nodes. OpenShift is installed into the private VPC, which has only private subnets. Any traffic in and out of this private network goes through an AWS Transit Gateway with pre-defined routing tables. If you need to expose your OpenShift-hosted applications to the internet, you will likely need an external load balancer pointing to the OpenShift private-zone application record hosted in Route 53. The key is AWS PrivateLink (you can read more in this AWS blog). The traffic flow is: Route 53 public record -> public application load balancer -> PrivateLink endpoint service -> private VPC ELB for OpenShift.
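The PrivateLink piece of that chain can be sketched in Terraform roughly as follows. The load balancer ARN, VPC, subnet, and security group are passed in as variables because they are assumptions about your environment: the endpoint service is backed by the internal NLB in the private VPC, and the DMZ VPC connects to it through an interface endpoint, whose IP addresses the public application load balancer then targets.

```
variable "ingress_nlb_arn" {}   # internal NLB in the private VPC fronting the OpenShift router
variable "dmz_vpc_id" {}
variable "dmz_subnet_ids" { type = list(string) }
variable "dmz_endpoint_sg_id" {}

# Expose the private VPC's ingress NLB as a PrivateLink endpoint service.
resource "aws_vpc_endpoint_service" "ocp_apps" {
  acceptance_required        = false
  network_load_balancer_arns = [var.ingress_nlb_arn]
}

# Interface endpoint in the DMZ VPC; the internet-facing application
# load balancer forwards to this endpoint's IP addresses.
resource "aws_vpc_endpoint" "ocp_apps_dmz" {
  vpc_id             = var.dmz_vpc_id
  service_name       = aws_vpc_endpoint_service.ocp_apps.service_name
  vpc_endpoint_type  = "Interface"
  subnet_ids         = var.dmz_subnet_ids
  security_group_ids = [var.dmz_endpoint_sg_id]
}
```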

VPC is the networking foundation of the OpenShift cluster. Be aware, however, that OpenShift, being based on Kubernetes, has its own networking layer. It uses OpenShift SDN, which configures an overlay network using Open vSwitch (OVS). The VPC takes care of host (VM) communication within a private network, while the SDN handles communication between OpenShift pods (a pod is a collection of containers and the basic unit of scheduling in Kubernetes). The VPC and OpenShift SDN work together nicely to provide truly isolated private networking on top of AWS.

AWS EC2

The OpenShift control plane (master) nodes must use Red Hat Enterprise Linux CoreOS (RHCOS) as the operating system. You can use RHCOS or RHEL for the worker nodes. I recommend RHCOS, which allows OpenShift to manage all aspects of the cluster machines, including the operating system.

Figure 3: EC2 instance for OpenShift
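As an illustration, a single control plane instance could be declared in Terraform as below. The AMI, subnet, security group, instance profile, and Ignition payload are all assumptions supplied as variables; the instance type and disk size are examples, so check the Red Hat documentation for the supported minimums.

```
variable "rhcos_ami_id" {}      # RHCOS AMI ID for your region (assumption)
variable "master_subnet_id" {}
variable "master_sg_id" {}
variable "master_profile_name" {}
variable "master_ignition" {}   # Ignition stub pointing the node at the machine config server

resource "aws_instance" "master_0" {
  ami                    = var.rhcos_ami_id
  instance_type          = "m5.xlarge" # example size
  subnet_id              = var.master_subnet_id
  vpc_security_group_ids = [var.master_sg_id]
  iam_instance_profile   = var.master_profile_name
  user_data              = var.master_ignition

  root_block_device {
    volume_size = 120 # example size in GiB
    volume_type = "gp2"
  }

  tags = { Name = "ocp-master-0" }
}
```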

AWS Load Balancer

To achieve high availability, OpenShift runs multiple nodes for each component, spread across multiple availability zones. To reach these components and the applications deployed on the worker nodes, you need load balancers. At minimum you need two load balancers: one for OpenShift control plane API requests and one for Route/Ingress traffic. Here are the details (a Terraform sketch follows the list):

  • Control plane load balancer — an internal (private network) Network Load Balancer. You must create this load balancer before the OpenShift installation.
    • Listens on port 6443, forwards to master nodes on port 6443 (Kubernetes API)
    • Listens on port 22623, forwards to master nodes on port 22623 (machine config server)
  • Load balancer for OpenShift Routes and the Ingress Controller. This load balancer is created by OpenShift during installation.
    • Listens on port 80, forwards to infra nodes on port 3xxxx (HTTP)
    • Listens on port 443, forwards to infra nodes on port 3xxxx (HTTPS)
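Here is a minimal Terraform sketch of the internal control plane load balancer from the first bullet. The VPC and subnet IDs are assumptions passed in as variables, and master instances would still need to be registered with both target groups (for example with aws_lb_target_group_attachment). The port 80/443 ingress load balancer is not shown because OpenShift creates it for you.

```
variable "ocp_vpc_id" {}
variable "private_subnet_ids" { type = list(string) }

# Internal Network Load Balancer for the control plane.
resource "aws_lb" "control_plane" {
  name               = "ocp-int"
  internal           = true
  load_balancer_type = "network"
  subnets            = var.private_subnet_ids
}

# Target groups for the Kubernetes API (6443) and the machine config
# server (22623) on the master nodes.
resource "aws_lb_target_group" "api" {
  name     = "ocp-api"
  port     = 6443
  protocol = "TCP"
  vpc_id   = var.ocp_vpc_id
}

resource "aws_lb_target_group" "machine_config" {
  name     = "ocp-mcs"
  port     = 22623
  protocol = "TCP"
  vpc_id   = var.ocp_vpc_id
}

resource "aws_lb_listener" "api" {
  load_balancer_arn = aws_lb.control_plane.arn
  port              = 6443
  protocol          = "TCP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.api.arn
  }
}

resource "aws_lb_listener" "machine_config" {
  load_balancer_arn = aws_lb.control_plane.arn
  port              = 22623
  protocol          = "TCP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.machine_config.arn
  }
}
```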

Both load balancers sit behind the DMZ in the private VPC, with the application load balancer mapped to a Route 53 private-zone A record. If you need to expose OpenShift applications to the internet, I suggest creating another load balancer (an Application or Network Load Balancer) in the DMZ VPC. This internet-facing load balancer receives application traffic via the Route 53 public zone and forwards it on to the private-zone application endpoint.

DNS with Route53

OpenShift relies on DNS resolution and name lookup for both internal and external connections to the cluster, so you need a DNS domain name registered either through Route 53 or another domain registrar. In either case, you need to create two Route 53 hosted zones prior to the OpenShift installation.

Assuming your domain name is “kpak.io”

- Route 53 Public Zone (with NS and SOA records)

- Route 53 Private Zone

The private zone will have two A records resolving to the internal control plane load balancer mentioned above, with the DNS names:

- api-int.$(CLUSTERID).kpak.io

- api.$(CLUSTERID).kpak.io

During installation, the OpenShift installer updates the private zone with an application DNS record that resolves to the Ingress (application) load balancer mentioned earlier.
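Building on the load balancer sketch above, the pre-created private zone and its two api records could look like this in Terraform. The cluster ID and base domain are examples, and naming the private zone after the cluster is an assumption; adjust to match your own naming.

```
variable "cluster_id"  { default = "mycluster" } # example cluster ID
variable "base_domain" { default = "kpak.io" }
variable "ocp_vpc_id"  {}

# Private hosted zone associated with the OpenShift VPC.
resource "aws_route53_zone" "private" {
  name = "${var.cluster_id}.${var.base_domain}"

  vpc {
    vpc_id = var.ocp_vpc_id
  }
}

# Alias A records pointing at the internal control plane NLB
# (aws_lb.control_plane from the earlier sketch).
resource "aws_route53_record" "api" {
  zone_id = aws_route53_zone.private.zone_id
  name    = "api.${var.cluster_id}.${var.base_domain}"
  type    = "A"

  alias {
    name                   = aws_lb.control_plane.dns_name
    zone_id                = aws_lb.control_plane.zone_id
    evaluate_target_health = false
  }
}

resource "aws_route53_record" "api_int" {
  zone_id = aws_route53_zone.private.zone_id
  name    = "api-int.${var.cluster_id}.${var.base_domain}"
  type    = "A"

  alias {
    name                   = aws_lb.control_plane.dns_name
    zone_id                = aws_lb.control_plane.zone_id
    evaluate_target_health = false
  }
}
```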

Users, Roles and Security Group

You will need an AWS IAM user to install OpenShift. The Red Hat documentation provides the detailed permissions for this IAM user. It amounts to almost granting "AdministratorAccess" to the installer, which concerns me a bit, but this user is only used during the initial installation.

You also need three IAM roles (AWS::IAM::Role) and three instance profiles (AWS::IAM::InstanceProfile), one for each role. They are used by the three types of OpenShift nodes (EC2 instances):

- Bootstrap

- Master

- Worker

Most of these roles require permission for EC2 and S3 access.
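As an illustration, one node role and its instance profile might look like the following in Terraform. The policy shown is a deliberately small example, not the full permission set from the Red Hat documentation; repeat the pattern for the bootstrap, master, and worker roles with their documented permissions.

```
# Example: worker node role. The bootstrap and master roles follow the
# same pattern with their own permission sets.
resource "aws_iam_role" "worker" {
  name = "ocp-worker-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "ec2.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

resource "aws_iam_role_policy" "worker" {
  name = "ocp-worker-policy"
  role = aws_iam_role.worker.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["ec2:Describe*"] # illustrative subset only
      Resource = "*"
    }]
  })
}

# The instance profile is what gets attached to the EC2 instances.
resource "aws_iam_instance_profile" "worker" {
  name = "ocp-worker-profile"
  role = aws_iam_role.worker.name
}
```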

The OpenShift cluster requires security groups that control access to and communication between the master and worker nodes. At a minimum, you will need:

- MasterSecurityGroup

- WorkerSecurityGroup
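A trimmed-down Terraform sketch of the master security group is shown below. The rules are a small illustrative subset (API and machine config server traffic from within the VPC); the full list of required ports for masters and workers is in the Red Hat documentation.

```
variable "ocp_vpc_id" {}
variable "vpc_cidr"   { default = "10.0.0.0/16" } # example VPC CIDR

resource "aws_security_group" "master" {
  name   = "ocp-master-sg"
  vpc_id = var.ocp_vpc_id

  # Kubernetes API from within the VPC (the internal NLB does not have
  # its own security group, so allow the VPC CIDR).
  ingress {
    from_port   = 6443
    to_port     = 6443
    protocol    = "tcp"
    cidr_blocks = [var.vpc_cidr]
  }

  # Machine config server, used by nodes to fetch their Ignition configs.
  ingress {
    from_port   = 22623
    to_port     = 22623
    protocol    = "tcp"
    cidr_blocks = [var.vpc_cidr]
  }

  # Allow all outbound traffic.
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```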

S3 Buckets

Provisioning of an OpenShift 4 cluster starts with a bootstrap node (an EC2 instance running RHCOS), which has to fetch a bootstrap.ign file at boot time. (I will explain the installation process in detail in the second blog.) Red Hat suggests uploading this generated Ignition file to an AWS S3 bucket before the installation, so you do need an S3 bucket, and the bootstrap instance profile mentioned earlier needs read access to it.
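In Terraform, the bucket and the Ignition upload are only a few lines. The bucket name is an example (bucket names are globally unique), and on AWS provider v3 and earlier the upload resource is called aws_s3_bucket_object instead of aws_s3_object.

```
resource "aws_s3_bucket" "ignition" {
  bucket = "ocp-bootstrap-ignition-example" # example name; pick your own
}

# Upload the bootstrap.ign file generated by the OpenShift installer.
resource "aws_s3_object" "bootstrap_ign" {
  bucket = aws_s3_bucket.ignition.id
  key    = "bootstrap.ign"
  source = "${path.module}/bootstrap.ign"
}
```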

Automate, automate and automate

OpenShift version 4 has simplified the Kubernetes and operating system installation with RHCOS and Operators. On AWS, we further enhanced the automation using Terraform. The result is a fully automated installation approach that takes care of the user-provisioned infrastructure (such as VPC, DNS, and ELB) as well as the OpenShift installation and basic post-installation configuration.

What’s next

In the coming blogs, my colleague Budi and I will do a deep dive into the actual OpenShift 4 installation on AWS with Terraform automation.

Stay tuned.

Bring your plan to the IBM Garage.
Are you ready to learn more about working with the IBM Garage? We’re here to help. Contact us today to schedule time to speak with a Garage expert about your next big idea. Learn about our IBM Garage Method, the design, development and startup communities we work in, and the deep expertise and capabilities we bring to the table.

Schedule a no-charge visit with the IBM Garage.
