Kubernetes Tip: How To Gracefully Handle Pod Shutdown?
There are many use cases where Pod deletion needs to be handled gracefully. A few examples: storing the logs of the pod being deleted in a remote location, finishing in-flight requests or jobs before the pod goes away, or updating certain rules or fields before shutdown. Such use cases require an understanding of how the shutdown process works, which in turn helps in designing the system better.
Here is the flow of events when a Pod gets deleted.
The picture considers a user deleting the pod using either the delete CLI or the delete API, but deletion can be triggered in many other ways, such as auto-scaling or rolling updates. Irrespective of the trigger, the flow of events remains the same as described in the picture, which is elaborated below.
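When the API server receives a request to delete a pod, it marks the pod as Terminating by setting a deletion timestamp on the pod object.
The terminating pod is removed from the endpoints of its Services. Kube-proxy, which runs on every node (typically as a DaemonSet), watches for this change and updates the node's routing rules so that no new traffic is sent to the pod.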
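The kubelet does the bulk of the work. For each container in the pod, the kubelet checks whether a preStop hook is configured. If it is, the kubelet runs the hook first and, once the hook completes, sends a SIGTERM signal to the container's main process (PID 1). If no preStop hook is configured, the kubelet sends SIGTERM right away. In both cases the process is expected to shut down gracefully on SIGTERM. The termination grace period starts counting when termination begins, so time spent in the preStop hook counts against it. Once the grace period expires, the kubelet forcefully kills (SIGKILL) any containers that are still running.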
After completing all these operations, the kubelet notifies the API server, which then removes the pod object completely.
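The TERM signal in the flow above only helps if the application reacts to it. As a rough illustration, here is a minimal Python sketch of the application side, assuming a simple worker loop. The names shutdown_requested, serve and process_one are hypothetical and not part of any Kubernetes API; a real service would plug in its own request or job handling.

```python
import signal
import threading

# Set once SIGTERM arrives; the worker loop polls it to stop taking new work.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Record the shutdown request; keep the handler itself tiny."""
    shutdown_requested.set()


def install_handler():
    # signal.signal must be called from the main thread of the process.
    signal.signal(signal.SIGTERM, handle_sigterm)


def serve(process_one, poll_seconds=0.1):
    """Call process_one() repeatedly until a shutdown has been requested.

    process_one is a hypothetical callable that handles one unit of work.
    Returns the number of units completed before the loop stopped.
    """
    done = 0
    while not shutdown_requested.is_set():
        process_one()
        done += 1
        # Sleep between units, but wake up immediately on shutdown.
        shutdown_requested.wait(poll_seconds)
    return done
```

The unit of work in progress is always finished before the loop exits, which is the "process current requests before the pod is deleted" behaviour described at the top. Note that the signal only reaches this code if the Python process is PID 1 in the container, or if whatever runs as PID 1 forwards it.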
By default, the termination grace period is 30 seconds, but it is configurable through the terminationGracePeriodSeconds field of the pod spec.
Alternatively, one can override the grace period from the CLI during a delete operation, using the --grace-period option of kubectl delete.
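The grace period lives at the pod spec level, in the terminationGracePeriodSeconds field. The sketch below builds a minimal Pod manifest as a Python dictionary and prints it as JSON, just to show where the field sits. The pod name and image are placeholder values, and pod_manifest is a hypothetical helper, not part of any client library.

```python
import json


def pod_manifest(name, image, grace_seconds=30):
    """Build a minimal Pod manifest with an explicit termination grace period."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Time allowed between the start of termination and the forced kill.
            "terminationGracePeriodSeconds": grace_seconds,
            "containers": [{"name": name, "image": image}],
        },
    }


if __name__ == "__main__":
    # "demo-app" and "example.com/demo-app:1.0" are placeholder values.
    manifest = pod_manifest("demo-app", "example.com/demo-app:1.0", 60)
    print(json.dumps(manifest, indent=2))
```

Since kubectl accepts JSON manifests as well as YAML, the printed output could be piped into kubectl apply -f - to try it out on a test cluster.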
Updated Based On Comments.
For a truly graceful shutdown, the event Remove_Pod_From_Service must occur before the event Send_SIG_TERM_Signal, so that the pod can handle the last few requests gracefully. However, these events occur asynchronously with no guaranteed order. In practice, ordering is achieved with a preStop hook that sleeps for a few seconds, on the expectation that Remove_Pod_From_Service completes during the sleep and therefore before Send_SIG_TERM_Signal.
The recommendation is to configure a preStop hook and the termination grace period for applications such that the Remove_Pod_From_Service event occurs before the Send_SIG_TERM_Signal event. Because the hook's sleep counts against the grace period, the grace period must be long enough to cover both the sleep and the application's own shutdown time.
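To make the recommendation concrete, here is a small Python sketch, under stated assumptions, of a container spec with a sleeping preStop hook plus a sanity check on the time budget. It assumes the container image ships a sleep binary; the function names and the drain_seconds estimate are hypothetical and would come from knowledge of the actual application.

```python
def container_with_prestop_sleep(name, image, sleep_seconds):
    """Container spec whose preStop hook sleeps before SIGTERM is sent."""
    return {
        "name": name,
        "image": image,
        "lifecycle": {
            "preStop": {
                # Assumes a `sleep` binary is available inside the image.
                "exec": {"command": ["sleep", str(sleep_seconds)]}
            }
        },
    }


def grace_period_is_sufficient(grace_seconds, prestop_sleep_seconds, drain_seconds):
    """True if the hook delay plus the app's drain time fits in the grace period.

    The preStop hook runs inside the termination grace period, so both
    durations have to fit before the kubelet force-kills the container.
    """
    return prestop_sleep_seconds + drain_seconds <= grace_seconds
```

For example, with the default 30-second grace period and a 5-second sleep, an application that needs up to 20 seconds to drain fits the budget, while one that needs 26 seconds does not and would call for a larger grace period.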
I hope this helps. Appreciate your comments.