Autoscaling in Oracle Cloud, with OKE (Oracle Container Engine) as load generator

Ali Mukadam
May 28, 2019 · 7 min read
Source: https://www.flickr.com/photos/vicpowles/38654098810/

In this post, we will look at Autoscaling in Oracle Cloud. From the horse’s mouth:

Autoscaling enables you to automatically adjust the number of Compute instances in an instance pool based on performance metrics such as CPU utilization. This helps you provide consistent performance for your end users during periods of high demand, and helps you reduce your costs during periods of low demand.

Let’s try this with an example:

  1. The application will be a simple Nginx server running on a compute instance.
  2. Create an Instance Configuration from the compute instance. This gives us a pre-defined configuration to use when creating instances as part of an instance pool.
  3. Create an Instance Pool based on the Instance Configuration, which allows us to provision multiple instances with the same configuration.
  4. Attach a private Load Balancer to the instance pool. This ensures that when instances are added to the pool, they are also automatically added to the Load Balancer's backend set, and traffic is routed to them once they are deemed healthy.
  5. Create an Autoscaling Configuration and set the necessary policies to scale out and scale in.
  6. To generate the necessary amount of traffic, we will use OKE (Oracle Container Engine) and the jmeter-operator.

Let’s get started.

Clone the terraform-oci-oke project and keep the default topology (3), number of node pools (1), and number of worker nodes per subnet (1). We will scale them later. For now, we just need the basic infrastructure (VCN, gateways, bastion, subnets, etc.).

Set the following parameters:

newbits = {
  "bastion" = "8"
  "lb"      = "8"
  "app"     = "8"
  "workers" = "8"
}

subnets = {
  "bastion"     = "11"
  "lb"          = "12"
  "app"         = "13"
  "plb"         = "14"
  "lb_ad1"      = "32"
  "lb_ad2"      = "42"
  "lb_ad3"      = "52"
  "workers_ad1" = "33"
  "workers_ad2" = "43"
  "workers_ad3" = "53"
}

Note that you do not need to use the same values as mine. You just need to ensure your subnets do not overlap.
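To see how these values translate into CIDR blocks, note that the module derives each subnet with Terraform's cidrsubnet() function. Assuming the default 10.0.0.0/16 VCN CIDR, you can verify the math in terraform console:

> cidrsubnet("10.0.0.0/16", 8, 13)
"10.0.13.0/24"    # the "app" subnet: newbits 8 widens the /16 to a /24, netnum 13 picks the block
> cidrsubnet("10.0.0.0/16", 8, 14)
"10.0.14.0/24"    # the "plb" subnet, home of the 10.0.14.x load balancer IP we will see later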

Once the basic VCN and OKE cluster have been provisioned, create an "app" subnet and a corresponding security list. The app subnet should be regional and private.

[Screenshot: Ingress rules for app subnet]
[Screenshot: Egress rules for app subnet]
[Screenshot: app subnet]
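If you prefer to script this step, a minimal Terraform sketch of a regional, private app subnet might look like the following. The CIDR, variable names, and security rules here are assumptions; match them to the rules in your own console setup:

# Assumed inputs: var.compartment_id, and var.vcn_id pointing at the VCN created by terraform-oci-oke.
resource "oci_core_security_list" "app" {
  compartment_id = var.compartment_id
  vcn_id         = var.vcn_id
  display_name   = "app"

  # Allow HTTP from within the VCN (e.g. from the private load balancer).
  ingress_security_rules {
    protocol = "6" # TCP
    source   = "10.0.0.0/16"
    tcp_options {
      min = 80
      max = 80
    }
  }

  egress_security_rules {
    protocol    = "all"
    destination = "0.0.0.0/0"
  }
}

resource "oci_core_subnet" "app" {
  compartment_id             = var.compartment_id
  vcn_id                     = var.vcn_id
  cidr_block                 = "10.0.13.0/24" # matches newbits/subnets values above
  display_name               = "app"
  prohibit_public_ip_on_vnic = true # private subnet
  security_list_ids          = [oci_core_security_list.app.id]
  # No availability_domain set, so the subnet is regional.
}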

Next, create a security list and a subnet for a Load Balancer. Note that this will be different from the load balancer security list and subnet created for OKE. The load balancer subnet should also be regional and private.

[Screenshot: Ingress rules for Private Load Balancer subnets]

You can set the egress rules the same as for the app subnet.

[Screenshot: Private Load Balancer subnet]

Next, create a private Load Balancer, ensuring you select the VCN created when you provisioned OKE and the private load balancer subnet you created above.

[Screenshot: Load Balancer Health Check]
[Screenshot: Create an HTTP listener]
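For reference, the same private load balancer, backend set, and HTTP listener can be sketched in Terraform. The shape, resource names, and health check path are assumptions; match them to what you selected in the console:

resource "oci_load_balancer_load_balancer" "private_lb" {
  compartment_id = var.compartment_id
  display_name   = "autoscale-demo-lb"
  shape          = "100Mbps" # assumed; use the shape you picked
  is_private     = true
  subnet_ids     = [oci_core_subnet.plb.id] # the private LB subnet created above (resource name assumed)
}

resource "oci_load_balancer_backend_set" "app" {
  load_balancer_id = oci_load_balancer_load_balancer.private_lb.id
  name             = "app"
  policy           = "ROUND_ROBIN"

  health_checker {
    protocol = "HTTP"
    port     = 80
    url_path = "/" # the Nginx default page
  }
}

resource "oci_load_balancer_listener" "http" {
  load_balancer_id         = oci_load_balancer_load_balancer.private_lb.id
  name                     = "http"
  default_backend_set_name = oci_load_balancer_backend_set.app.name
  port                     = 80
  protocol                 = "HTTP"
}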

Create a compute instance and pick the smallest available shape.


Ensure you add your SSH key and assign the instance to the app subnet.


Click on Advanced Options and paste the following cloud-init script:

#cloud-config
package_update: false
packages:
  - nginx
runcmd:
  - systemctl enable nginx
  - systemctl start nginx
  - firewall-offline-cmd --add-service=http
  - systemctl restart firewalld

Once the instance is running, click on "Create instance configuration".

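If you are scripting the setup, the console's "Create instance configuration" action corresponds to the oci_core_instance_configuration Terraform resource. A minimal sketch, assuming an Oracle Linux image OCID in var.image_id and the cloud-init script above saved locally as cloud-init.yaml:

resource "oci_core_instance_configuration" "app" {
  compartment_id = var.compartment_id
  display_name   = "app-instance-config"

  instance_details {
    instance_type = "compute"

    launch_details {
      compartment_id = var.compartment_id
      shape          = "VM.Standard.E2.1" # assumed; use the smallest shape available to you

      create_vnic_details {
        subnet_id        = oci_core_subnet.app.id
        assign_public_ip = false
      }

      source_details {
        source_type = "image"
        image_id    = var.image_id # assumed; e.g. an Oracle Linux image OCID
      }

      metadata = {
        ssh_authorized_keys = file(var.ssh_public_key_path)
        # cloud-init.yaml holds the #cloud-config script shown earlier.
        user_data = base64encode(file("cloud-init.yaml"))
      }
    }
  }
}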

Then, click on "Create Instance Pool" to create one.


On the next screen, enter 1 for the minimum number of instances, check the "Attach Load Balancer" checkbox, and select your load balancer and backend set. Set the port to 80 and, under Availability Domain Selection 1, select your VCN and set the subnet to app.


Click "Additional Selection" twice and select AD2 and AD3 for the additional selections. If your region has only 1 Availability Domain, you can skip this step.
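In Terraform, the pool corresponds to oci_core_instance_pool. Note the load_balancers block: it is what causes new instances to be registered with the backend set automatically. A sketch, with the availability domain value assumed:

resource "oci_core_instance_pool" "app" {
  compartment_id            = var.compartment_id
  display_name              = "app-pool"
  instance_configuration_id = oci_core_instance_configuration.app.id
  size                      = 1

  # One placement per Availability Domain; repeat for AD2/AD3 if your region has them.
  placement_configurations {
    availability_domain = var.ad1 # assumed variable, e.g. "xxxx:PHX-AD-1"
    primary_subnet_id   = oci_core_subnet.app.id
  }

  # Instances added to the pool are registered with this backend set on port 80.
  load_balancers {
    load_balancer_id = oci_load_balancer_load_balancer.private_lb.id
    backend_set_name = oci_load_balancer_backend_set.app.name
    port             = 80
    vnic_selection   = "PrimaryVnic"
  }
}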

Finally, create an Autoscaling Configuration from the instance pool.


In the Autoscaling configuration, leave the cooldown period at its minimum of 300 seconds,


and configure your autoscaling policy as follows:

[Screenshot: Autoscaling policy]

Set the minimum number of instances to 1, the maximum to 6, and the initial count to 1. For the scaling rules, we will set ridiculously low thresholds so that we can easily trigger an autoscaling event.
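For completeness, here is a Terraform sketch of the whole Autoscaling configuration using oci_autoscaling_auto_scaling_configuration. The thresholds mirror the deliberately low settings above; the exact scale-in value (below 1%) is an assumption:

resource "oci_autoscaling_auto_scaling_configuration" "app" {
  compartment_id       = var.compartment_id
  display_name         = "autoscale-demo"
  cool_down_in_seconds = 300 # the minimum

  auto_scaling_resources {
    id   = oci_core_instance_pool.app.id
    type = "instancePool"
  }

  policies {
    display_name = "cpu-threshold-policy"
    policy_type  = "threshold"

    capacity {
      initial = 1
      min     = 1
      max     = 6
    }

    # Scale out by 1 instance when average CPU exceeds 2%.
    rules {
      display_name = "scale-out"
      action {
        type  = "CHANGE_COUNT_BY"
        value = 1
      }
      metric {
        metric_type = "CPU_UTILIZATION"
        threshold {
          operator = "GT"
          value    = 2
        }
      }
    }

    # Scale in by 1 instance when average CPU drops below 1% (assumed value).
    rules {
      display_name = "scale-in"
      action {
        type  = "CHANGE_COUNT_BY"
        value = -1
      }
      metric {
        metric_type = "CPU_UTILIZATION"
        threshold {
          operator = "LT"
          value    = 1
        }
      }
    }
  }
}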

Your configuration is all done.

We now need to generate enough traffic to push the instance's CPU utilization above 2% for 3 minutes, so that an autoscaling event occurs and the instance pool scales out by at least 1 more instance. For that, we will use Apache JMeter, or more specifically, the JMeter Operator running in OKE.

First, scale out the OKE cluster: increase the number of workers per subnet to at least 10 and run terraform apply to make the change. You can also do this in the OCI Console.
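With terraform-oci-oke, that is a one-line change in your terraform.tfvars. The variable name below is an assumption; check the module's documentation for the exact name in your version:

# terraform.tfvars (variable name assumed)
node_pool_quantity_per_subnet = 10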

Login to the bastion and clone the JMeter Operator project:

git clone https://github.com/kubernauts/jmeter-operator
cd jmeter-operator

Edit the jmeter-deploy.yaml as follows:

apiVersion: loadtest.jmeter.com/v1alpha1
kind: Jmeter
metadata:
  name: tqa-loadtest
  namespace: tqa
spec:
  # Add fields here
  slave_size: 30
  jmeter_master_image: kubernautslabs/jmeter_master:5.0
  jmeter_slave_image: kubernautslabs/jmeter_slave:5.0
  grafana_server_root: /
  grafana_service_type: ClusterIP
  grafana_image: grafana/grafana:5.2.0
  influxdb_image: influxdb
  grafana_install: "true"
  grafana_reporter_install: "true"
  grafana_reporter_image: kubernautslabs/jmeter-reporter:latest
  influxdb_install: "true"

Adjust these fields to suit your environment; in particular, ensure the slave_size matches the number of worker nodes.

Download the autoscaledemo.jmx:

curl -o autoscaledemo.jmx https://raw.githubusercontent.com/hyder/okesamples/master/autoscaling/autoscaledemo.jmx

and change the BASE_URL_1 value from 10.0.14.4 to the IP address of your private load balancer:

<Arguments guiclass="ArgumentsPanel" testclass="Arguments" testname="User Defined Variables" enabled="true">
  <collectionProp name="Arguments.arguments">
    <elementProp name="BASE_URL_1" elementType="Argument">
      <stringProp name="Argument.name">BASE_URL_1</stringProp>
      <stringProp name="Argument.value">10.0.14.4</stringProp>
      <stringProp name="Argument.metadata">=</stringProp>
    </elementProp>
  </collectionProp>
</Arguments>

Ensure all your worker nodes are active and follow the steps on the jmeter-operator page to install it.

At this point, your environment should be stable and look like this:

[Screenshot: Stable pool]
[Screenshot: Stable Instance Configuration]
[Screenshot: Stable backend set]
[Screenshot: Stable metrics]

Initialize your JMeter cluster and run a test:

./initialize_cluster.sh
./start-test.sh
Enter the Jmeter Namespace: tqa
Enter path to the jmx file autoscaledemo.jmx

Now, let JMeter run and check back after 5 minutes. The autoscaling event is triggered and an additional instance is created:

[Screenshot: Instance Configuration with Target Count=2]
[Screenshot: 2nd instance being provisioned]

Eventually, things stabilize and the 2nd instance is provisioned:

[Screenshot: Instance pool with 2 instances]

and the 2nd instance is automatically registered as an additional backend with the load balancer.


When the test run completes and the traffic stops, the average CPU utilization decreases, triggering another autoscaling event. This time, the instance count is reduced:

[Screenshot: Scaling down]
[Screenshot: Instance pool scaling down]

And eventually, the instance is terminated.

