Kubernetes: Nginx and Zero Downtime in Production

Codecademy Engineering · Mar 18, 2020

When running an application in Kubernetes, it is important that the ingress controllers are able to run with zero downtime, especially in production. Otherwise, users will experience degraded service whenever the controllers are restarted or redeployed, or when the underlying nodes are rolled.

This article will explain how to set up an Nginx ingress controller in Kubernetes, with zero downtime.

The Problem

The official Nginx Ingress controller, by default, has interruptions in service when its pods are restarted or redeployed. This can result in dropped connections whenever the ingress pods are terminated for any reason. In production environments especially, this is unacceptable: we need an Nginx ingress controller that is always available to serve traffic.

Kubernetes and SIGTERM

When terminating a pod, Kubernetes will send a SIGTERM signal to the main process and wait for some predetermined amount of time for the pod to terminate (default is 30s). After that time has passed, Kubernetes will send SIGKILL, which causes the process to terminate immediately.

See the Kubernetes documentation on pod termination for details.

Kubernetes expects pods to handle SIGTERM with a graceful shutdown. In reality, not all pods honor this expectation.
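To make the expectation concrete, here is a minimal sketch of a SIGTERM-aware shell entrypoint. This is illustrative only and not from the article; the handler body is hypothetical:

#!/bin/sh
# Run cleanup when Kubernetes sends SIGTERM, then exit cleanly.
graceful_shutdown() {
  echo "SIGTERM received, draining..."
  # stop accepting new work and finish in-flight requests here
  exit 0
}
trap graceful_shutdown TERM

# stand-in for the main workload
while true; do sleep 1; done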

Nginx Signals: SIGTERM vs SIGQUIT

Nginx handles signals a little differently than Kubernetes expects. According to the Nginx documentation on signal handling, the master process responds to TERM and QUIT as follows:

 Nginx Signals
+-----------+-------------------+
| signal    | response          |
+-----------+-------------------+
| TERM, INT | fast shutdown     |
| QUIT      | graceful shutdown |
+-----------+-------------------+

So, when Kubernetes sends a SIGTERM to the nginx-ingress-controller pod, Nginx enacts a fast shutdown. If the controller is processing any requests at that moment, they are interrupted and their connections dropped. This can cause a cascade of failures, producing a spike in HTTP 50x responses during pod termination.

Instead, we want to send Nginx a SIGQUIT signal, which prompts a graceful shutdown: Nginx stops accepting new connections and waits for in-flight requests to finish before exiting.
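For reference, outside of Kubernetes the same graceful shutdown can be triggered by hand, either by signaling the master process directly or through the nginx binary. The pid file path varies by build and is an assumption here:

# Send SIGQUIT to the nginx master process directly...
kill -QUIT "$(cat /var/run/nginx.pid)"

# ...or equivalently, let the nginx binary signal it:
nginx -s quit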

The Solution: Lifecycle PreStop Hook

Kubernetes offers a preStop lifecycle hook in the pod spec. During termination, Kubernetes runs the preStop hook after marking the pod as Terminating but before sending the SIGTERM signal.

The preStop hook allows the pod to run an arbitrary command before the termination signal arrives. In our case, we want to pre-emptively send Nginx the SIGQUIT signal, so that the controller terminates its connections gracefully.

More info on preStop can be found in the Kubernetes documentation on container lifecycle hooks.

How to Send Nginx SIGQUIT

The nginx-ingress-controller image,

quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.24.1

includes a command for sending Nginx termination signals.

Running the following script in this image:

# Ask the nginx master process to shut down gracefully (SIGQUIT)
/usr/local/openresty/nginx/sbin/nginx -c /etc/nginx/nginx.conf -s quit
# Wait until every nginx process has exited
while pgrep -x nginx; do
  sleep 1
done

will send Nginx a SIGQUIT signal and wait for the process to terminate.

Note that this script by itself could run indefinitely if the nginx process never quits. We will bound the overall shutdown time with terminationGracePeriodSeconds below.
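If you also want to bound the wait inside the script itself, a simple counter works. This is a sketch, not from the controller image; MAX_WAIT is an illustrative name:

# Ask nginx to shut down gracefully, but stop waiting after MAX_WAIT seconds.
/usr/local/openresty/nginx/sbin/nginx -c /etc/nginx/nginx.conf -s quit
MAX_WAIT=60
while pgrep -x nginx && [ "$MAX_WAIT" -gt 0 ]; do
  sleep 1
  MAX_WAIT=$((MAX_WAIT - 1))
done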

Defining the preStop hook

We can put the above script into a one-line command, and add this to the lifecycle section of a pod spec, as follows:

lifecycle:
  preStop:
    exec:
      command: ["/bin/sh", "-c", "sleep 5; /usr/local/openresty/nginx/sbin/nginx -c /etc/nginx/nginx.conf -s quit; while pgrep -x nginx; do sleep 1; done"]

Note the sleep 5 command before the actual script. This waits out any Kubernetes-related race conditions before initiating the graceful shutdown. As of this writing it is unclear exactly why this is necessary, but during testing nginx would still drop connections unless this sleep was included. One plausible explanation is that removal of the pod from the Service's endpoints propagates asynchronously, so traffic can still arrive for a few seconds after termination begins.

Defining terminationGracePeriodSeconds

A preStop hook will certainly give us the QUIT signal needed to shut down nginx gracefully. However, by default Kubernetes allows only 30 seconds for the entire termination process before sending SIGKILL. This means both the preStop hook and the handling of Kubernetes' SIGTERM signal must complete within those 30 seconds, or Nginx will still be killed ungracefully and drop connections.

To mitigate this, we can increase the grace period for Kubernetes to terminate the way we want. This is defined in the Pod spec, as follows:

spec:
  terminationGracePeriodSeconds: 600

Note that if 30 seconds is enough time for your Nginx ingress setup, you may skip this part.
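Putting the two settings together, the relevant fragment of the controller's pod spec looks roughly like this. This is a sketch: surrounding Deployment fields are omitted and the container name is illustrative:

spec:
  terminationGracePeriodSeconds: 600
  containers:
    - name: nginx-ingress-controller  # illustrative name
      lifecycle:
        preStop:
          exec:
            command: ["/bin/sh", "-c", "sleep 5; /usr/local/openresty/nginx/sbin/nginx -c /etc/nginx/nginx.conf -s quit; while pgrep -x nginx; do sleep 1; done"]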

More information on the termination grace period and hook handler execution can be found in the Kubernetes documentation.

Deploying the Fix with Zero Downtime

Solving the graceful-shutdown problem does not by itself give us a zero-downtime deploy when we update the nginx Deployment: the currently running pods still have the old, fast-shutdown behavior. The solution is to run TWO load balancers and TWO nginx ingress controllers, and switch traffic between them. Direct traffic to a temporary load balancer, update the original while no traffic is flowing through it, and then direct traffic back to the original ingress controller.

Step 1: Create a Second Ingress Controller

We deploy nginx-ingress-controller via helm:

helm upgrade --install nginx-ingress stable/nginx-ingress --namespace ingress -f nginx/values.yaml

This will create an nginx-ingress-controller in the ingress namespace. We have defined additional configuration in a values.yaml file.
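The contents of values.yaml are site-specific and not shown in this article. Purely as an illustration, a minimal file for the stable/nginx-ingress chart might look like this (the keys are real chart values; the numbers are examples):

controller:
  replicaCount: 2        # run more than one controller pod
  service:
    type: LoadBalancer   # ask the cloud provider for a load balancer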

We will need to install a second controller, which we can also do via helm, this time into a new namespace called ingress-temp:

helm upgrade --install nginx-ingress-temp stable/nginx-ingress --namespace ingress-temp -f nginx/values.yaml

NOTE: If you are running Helm 3 you will need to first create the namespace via kubectl:

kubectl create namespace ingress-temp

Now that we have two ingress controllers, it is time to direct traffic to the temporary controller so that we can upgrade the original without interruptions in traffic.

Step 2: Redirect Traffic to Temporary Controller

Both Nginx controllers are now able to route traffic to your services, but your DNS records point at only one of the load balancers. Change the DNS records for your sites to point at the load balancer created for nginx-ingress-temp. How to do this varies by DNS provider.

Monitor Traffic Flow

It’s useful to log traffic from both controllers during this changeover. We can do this with kubectl in two separate terminal windows:

Before flipping the DNS switch, follow traffic logs from the original controller (ingress namespace):

kubectl logs -lcomponent=controller -ningress -f

You should see all traffic flowing through this controller. This is the one we will take offline and update with a preStop hook.

In a separate window, follow traffic logs from the temporary controller created in Step 1:

kubectl logs -lcomponent=controller -ningress-temp -f

Aside from nginx's own startup messages, you should see no traffic in this controller's logs.

Keep these windows open. We will monitor the change in traffic flow as we switch the DNS.

Change DNS to Temporary Controller

Now it is time to direct traffic to the temporary Load Balancer. Kubernetes automatically creates and configures a cloud load balancer whenever helm creates a new Service of type LoadBalancer.

More about Kubernetes' automatic creation of load balancers can be found in the Kubernetes documentation on Services.

To obtain the External IP of the newly created Load balancer, run the following kubectl command:

kubectl get service -ningress-temp

Look for the LoadBalancer type service, and get the External-IP. For AWS, for example, this may look like:

EXTERNAL-IP
xxxxxxxxxxxxxxxxxxxxxx-xxxxxxxxxx.us-west-1.elb.amazonaws.com
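If you prefer to script this step, jsonpath can pull the address straight from the service status. The service name below is a guess at what the chart generates for this release; adjust it to match yours:

# AWS exposes the load balancer as a hostname; other clouds may populate .ip instead
kubectl get service nginx-ingress-temp-controller -ningress-temp \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'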

Update your DNS provider to point your site at this EXTERNAL-IP. Depending on your settings, this change may take a while to propagate.
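You can check propagation from your workstation with dig, substituting your real hostname for the placeholder:

# Should eventually resolve to the temporary load balancer's address
dig +short www.example.com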

Monitor Change in Traffic Flow

Look back at your terminal windows monitoring traffic flow through the Nginx controllers. Notice that the logs for the original controller are becoming less frequent, and the temporary controller is picking up traffic.

Wait for traffic to fully drain from the original controller. Once this is completed, move on to Step 3.

Step 3: Redeploy Original Controller with PreStop Update

Once traffic has drained from the original nginx-ingress-controller, it is safe to update its deployment with the preStop hook.

Add the following configuration to your nginx values.yaml:

controller:
  lifecycle:
    preStop:
      exec:
        command: ["/bin/sh", "-c", "sleep 5; /usr/local/openresty/nginx/sbin/nginx -c /etc/nginx/nginx.conf -s quit; while pgrep -x nginx; do sleep 1; done"]
  terminationGracePeriodSeconds: 600

Note again that terminationGracePeriodSeconds is OPTIONAL and the number of seconds is configurable based on your controller's needs.

Redeploy the original controller with helm:

helm upgrade --install nginx-ingress stable/nginx-ingress --namespace ingress --version 1.6.16 -f nginx/values.yaml

Ensure the controller is deployed with the preStop hook applied:

kubectl get deployment nginx-ingress-controller -ningress -oyaml
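Since the full YAML is long, you can confirm the rollout finished and filter the output for the hook instead:

# Wait for the new pods to roll out...
kubectl rollout status deployment/nginx-ingress-controller -ningress
# ...then confirm the preStop hook made it into the spec
kubectl get deployment nginx-ingress-controller -ningress -oyaml | grep -A 6 preStop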

The original controller is ready to accept traffic again.

Step 4: Direct Traffic Back to Original Controller

To obtain the External IP of the original Load Balancer, run the following kubectl command:

kubectl get service -ningress

Update your DNS to direct back to the original Load Balancer. Once again these changes may take a while to propagate, depending on your settings.

Monitor Traffic Flow Back to Original Controller

Go back to your terminal windows in which you are monitoring traffic for the two controllers. If the kubectl logs command timed out, you may have to invoke the command again.

If the above steps have gone according to plan, you should see traffic dropping in the temporary controller and picking back up in the original controller.

Once again, once traffic has been fully drained, proceed to the final step.

Step 5: Remove Extra Infrastructure

Now that traffic is flowing through the original controller, we no longer need the temporary one.

Use helm to delete the temporary controller. With Helm 2:

helm delete --purge nginx-ingress-temp

With Helm 3, which scopes releases to namespaces and purges by default:

helm delete nginx-ingress-temp --namespace ingress-temp

Again, if running Helm 3, delete the temporary namespace via kubectl:

kubectl delete namespace ingress-temp

Note that you may want to wait a week or so before removing the extra controller, depending on how long client apps cache DNS and keep connections alive.

Step 6: Celebrate!

Congratulations, you have deployed a zero-downtime nginx ingress controller! Now is the time to pat yourself on the back for a job well done.

Future Nginx Deployments

After following the above steps, you should be able to deploy future changes normally, with a simple helm upgrade command. Nginx will handle traffic gracefully, and new controller pods will come up without any interruptions in traffic.

However, if you are making risky changes to your ingress controller or prefer an abundance of caution when making any production-level changes, this is a great method for updating your infrastructure. This method allows for easy switching between a known good controller and one with untested changes, such as switching between versions or sources. Simply switch the DNS to the changed controller, monitor logs for unwanted behavior, and switch back quickly if breaking changes are detected.

Lindsay Landry, DevOps Engineer at Codecademy
