Graceful Shutdown of Spring Boot Applications in Kubernetes

Kaan Taş
Published in Trendyol Tech
Nov 24, 2020

As the Trendyol Delivery team, we manage the deployment of our applications in Kubernetes. We have three separate Kubernetes clusters, and we determine the number of pods in these clusters according to the load our applications receive.

When we start a new deployment, our existing pods are terminated and new pods replace them. If an old pod is in the middle of a long operation during this deployment process, we want to kill the pod only after that operation has completed successfully (graceful shutdown). Otherwise, the killed pods may not respond successfully to their last requests. Especially if we have an API that receives a lot of traffic, our error rate increases significantly during the deployment process. Therefore, it is essential for us to make the deployment as fast and error-free as possible.

Kubernetes Termination Lifecycle

Kubernetes runs this reconciliation cycle continuously

Kubernetes constantly monitors the system and computes the difference between the current and desired state; if there is a problem, it reacts immediately and tries to bring the applications back to the state we want (the Desired State).

Kubernetes helps keep the entire system healthy with this cycle. If a healthcheck sent to a pod returns an unsuccessful response, or when we make a new deployment, the pod is replaced with a new one. Old pods should be terminated as quickly and with as few errors as possible.

While a pod is terminating, any in-flight data should be saved to the database successfully, and network connections, database connections, etc. should be closed cleanly. This is called a graceful shutdown.

Generally, applications can shut down gracefully when they receive the SIGTERM signal, but this may not happen because of third-party applications or components outside the system that we cannot control. In such cases we may need a preStop hook.

Pod Termination Steps on Kubernetes

The following 5 steps occur when Kubernetes kills a pod:

1- The pod switches to the Terminating state and stops receiving new traffic. The container is still running inside the pod.

2- The preStop hook, a special command or HTTP request, is sent to the container inside the pod and executed.

3- The SIGTERM signal is sent to the pod, and the container realizes that it will be shut down soon.

4- Kubernetes waits for a grace period (terminationGracePeriodSeconds, 30 seconds by default). This waiting runs in parallel with the preStop hook and the SIGTERM handling, so Kubernetes does not wait for them to finish. When this period ends, it goes directly to the next step. It is very important to set the value of the grace period correctly.

5- If the container is still running after the grace period, the SIGKILL signal is sent, the pod is forcibly removed, and the termination is finished.
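
To make these steps concrete, here is a minimal sketch of where the related fields live in a pod spec; the container name, image, and the preStop command are placeholders, not values from our deployment:

spec:
  terminationGracePeriodSeconds: 30     # step 4: how long Kubernetes waits before SIGKILL
  containers:
    - name: my-app                      # placeholder
      image: my-app:latest              # placeholder
      lifecycle:
        preStop:                        # step 2: runs as soon as termination starts
          exec:
            command: ["sh", "-c", "sleep 10"]   # e.g. give the app time to drain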

Liveness & Readiness Probes

Another important Kubernetes concept is probes: periodic checks used to learn the status (health, readiness for traffic, etc.) of a pod in the cluster.

There are three probe types in Kubernetes: Liveness, Readiness, and Startup probes.

Liveness Probe: Is the pod running and healthy?

Readiness Probe: Can the pod accept requests?

Startup Probe: Has the application in the container started successfully?

Probes return one of three statuses: Success, Failure, or Unknown.

There are also three probe execution methods: ExecAction, TCPSocketAction, and HTTPGetAction. The most common of these is HTTPGetAction.

Below is a configuration example of liveness and readiness probes in a deployment.yml file.
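
A minimal sketch of such a probe block inside the container spec; the /index.html path, port 80, and timing values are the ones explained below:

livenessProbe:
  httpGet:
    path: /index.html
    port: 80
  initialDelaySeconds: 10
  timeoutSeconds: 3
  periodSeconds: 5
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /index.html
    port: 80
  initialDelaySeconds: 10
  timeoutSeconds: 3
  periodSeconds: 5
  failureThreshold: 3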

httpGet is the keyword of the HTTPGetAction probe. Kubernetes sends a GET request to the /index.html endpoint on port 80 and waits for a status from the liveness probe.

initialDelaySeconds: 10 means the GET request is sent after waiting 10 seconds.

timeoutSeconds: 3 means the probe times out if there is no response within 3 seconds.

periodSeconds: 5 means the check is repeated by sending this request every 5 seconds.

failureThreshold: 3 means the probe is allowed to fail 3 times before action is taken.

If the readiness probe doesn’t receive an UP status, traffic won’t be routed to that pod, but the pod won’t be restarted.

If the HTTP GET request for the liveness probe doesn’t receive an UP status within the timing conditions above, the pod will be restarted.

Spring Boot Actuator

Spring Boot Actuator allows us to get information about the general status of our application.

We need to add it to our application as a Maven dependency.

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

It provides us with a lot of information, such as whether the application is running and healthy, traffic status, recent HTTP requests, etc. We get this information through the HTTP endpoints provided by Actuator (localhost:8080/actuator).

While our application is running, if we send an HTTP GET request to the localhost:8080/actuator/health endpoint, the health status of the application is returned (healthcheck). Health is the most used piece of Actuator information.

Health Status is UP for the Running Application
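
With the default configuration, the response body of /actuator/health for a healthy application looks like this:

{"status":"UP"}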

We can either use Spring Boot’s default healthcheck or create a custom healthcheck according to criteria we define.

One of the main features of Kubernetes is regularly checking the health of the overall application and replacing unhealthy instances (pods) with healthy ones. A pod must be in a healthy (UP) status to keep running on a Kubernetes node. There are two mechanisms to check this: liveness and readiness probes.

Our Solution

We created a custom healthcheck service with Spring Boot to implement Graceful Shutdown in our API.

Custom Health Service to implement unhealthy()

The unhealthy() method sets the healthy flag to false and waits for 15 seconds here.
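
A minimal sketch of such a service; the healthy flag and the 15-second wait follow the description above, while the class structure and names are assumptions:

import org.springframework.stereotype.Service;

import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

@Service
public class HealthService {

    // Flag read by the custom health indicator; the application starts healthy.
    private final AtomicBoolean healthy = new AtomicBoolean(true);

    public boolean isHealthy() {
        return healthy.get();
    }

    // Called via the /unhealthy endpoint from the preStop hook:
    // marks the application as DOWN and keeps the request busy for 15 seconds.
    public void unhealthy() {
        healthy.set(false);
        try {
            TimeUnit.SECONDS.sleep(15);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}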

Custom Health Check Configuration

The CustomHealthCheck class extends the AbstractHealthIndicator class and allows us to build a custom healthcheck by overriding the doHealthCheck() method. Depending on the flag we receive from HealthService, we set the system’s health status to UP or DOWN.
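
A minimal sketch of this indicator; it assumes the HealthService sketch above exposes an isHealthy() accessor:

import org.springframework.boot.actuate.health.AbstractHealthIndicator;
import org.springframework.boot.actuate.health.Health;
import org.springframework.stereotype.Component;

@Component
public class CustomHealthCheck extends AbstractHealthIndicator {

    private final HealthService healthService;

    public CustomHealthCheck(HealthService healthService) {
        this.healthService = healthService;
    }

    @Override
    protected void doHealthCheck(Health.Builder builder) {
        // Report UP or DOWN based on the flag held by HealthService,
        // so /actuator/health reflects the state set by the preStop call.
        if (healthService.isHealthy()) {
            builder.up();
        } else {
            builder.down();
        }
    }
}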

Custom Health Controller to get /unhealthy

HealthController is a REST controller that calls the unhealthy() service method. We will call the /unhealthy endpoint from the preStop hook.
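
A minimal sketch of the controller; the GET mapping and the delegation to HealthService follow the description, the rest is an assumption:

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class HealthController {

    private final HealthService healthService;

    public HealthController(HealthService healthService) {
        this.healthService = healthService;
    }

    // Called by the Kubernetes preStop hook; blocks for the 15-second
    // wait inside HealthService.unhealthy() before returning.
    @GetMapping("/unhealthy")
    public void unhealthy() {
        healthService.unhealthy();
    }
}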

Then, we configured values such as the readiness and liveness probes, the preStop hook, and terminationGracePeriodSeconds in the deployment.yml file we created to deploy our application to Kubernetes.
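
A minimal sketch of the relevant parts of such a deployment.yml; the container name, image, port, and timing values are placeholders, while the /actuator/health probes, the /unhealthy preStop hook, and terminationGracePeriodSeconds follow the text:

spec:
  terminationGracePeriodSeconds: 15       # the grace period value referred to later in the text
  containers:
    - name: shipping-api                  # placeholder name
      image: registry.example/shipping-api:latest   # placeholder image
      ports:
        - containerPort: 8080
      livenessProbe:
        httpGet:
          path: /actuator/health
          port: 8080
        initialDelaySeconds: 10           # timing values are placeholders
        periodSeconds: 5
      readinessProbe:
        httpGet:
          path: /actuator/health
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 5
      lifecycle:
        preStop:
          httpGet:                        # calls the application's /unhealthy endpoint
            path: /unhealthy
            port: 8080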

In the deployment.yml, we specified that the /actuator/health endpoint will be used for the liveness and readiness probes, and we also set some timing values.

We specified that we will use the application’s /unhealthy endpoint for the preStop hook.

When we start a new deployment, the old pod changes to the Terminating state. At this point, the Service removes the pod’s IP from the iptables rules and prevents new traffic from reaching it.

After that, the preStop hook is executed and the SIGTERM signal is sent to the terminating pod. The application that receives SIGTERM starts preparing for shutdown and tries to close its connections.

In parallel, the preStop hook calls the /unhealthy endpoint, which sets the application’s health status to DOWN and waits there for the terminationGracePeriodSeconds value (15 sec.). The point of this is to make the application use up to the full 15 seconds of the grace period. If our application shuts down before the grace period ends, Kubernetes does not wait the full 15 seconds; it kills the pod directly. By putting a simple operation on the Tomcat server to sleep, we keep the Spring Boot application busy and wait out this period. Here, we can also run anything we want before the wait (closing connections, logging, etc.).

Especially in lower-level languages such as Go, we have to close connections ourselves with this kind of procedure. Spring Boot, on the other hand, manages these operations very well, so we usually do not need to code them manually. But in some cases, especially under high traffic, it is necessary to handle the shutdown process in code.

For example, when we deployed Shipping-API, which handles 100k rpm, we were getting some errors (BeanException), even if only a few (<1%). While the pod was in Terminating status and processing its final requests, some Spring beans deep in the code had already been destroyed, so the application could not find them and threw a BeanException. By keeping the Tomcat server waiting, we prevent these beans from shutting down early. In this way, we were able to eliminate these errors.

At the end of the 15 seconds, the operations on the old pods have been completed successfully and the connections have been closed. With the SIGKILL signal, the old pods are removed, and only the new pods remain in the system.
