Graceful shutdown with Play and Kubernetes

We recently migrated our platform to Kubernetes. It's easily one of the best moves, technologically speaking, that we've ever made. It provides stability, scalability, service discovery, centralized configuration, rolling upgrades… the list could keep going.

When you move to a platform for automating deployment and scaling like Kubernetes/DCOS, one of the things you absolutely need to get right is a proper implementation of health checks (liveness and readiness) and a graceful shutdown of your application (on top of compute resources, but that’s another topic).

We use the Play Framework a lot; it's a great Java/Scala framework for developing web applications. But to get a smooth deployment/scale-down process with Kubernetes, you need a proper graceful shutdown of your application. In a web context, that could mean:

  • Wait for current requests to finish
  • Refuse any new incoming requests
  • Cancel any task scheduled on the Akka actor system scheduler
  • Shut down database connections

Unfortunately, none of those things come for free with Play. To give you the opportunity to shut down gracefully, Kubernetes first sends a SIGTERM signal to your application. Play's behaviour is to shut everything down immediately upon receiving that signal. That means current requests don't complete and clients receive a “connection closed” error. Not very “graceful”…

First, we need to handle SIGTERM in a better way: we can't let the application shut down right away. That's the purpose of this signal handler singleton service:
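Below is a minimal sketch of what such a service could look like, assuming Play with runtime dependency injection and Akka; the class name, the isShuttingDown flag and the 30-second delay are illustrative choices, not the exact implementation:

```scala
import java.util.concurrent.atomic.AtomicBoolean
import javax.inject.{Inject, Singleton}

import scala.concurrent.duration._

import akka.actor.ActorSystem
import sun.misc.{Signal, SignalHandler => JvmSignalHandler}

@Singleton
class SignalHandler @Inject()(actorSystem: ActorSystem) {

  private val shuttingDown = new AtomicBoolean(false)

  // Replace the default SIGTERM handler so the application doesn't shut down immediately.
  Signal.handle(new Signal("TERM"), new JvmSignalHandler {
    override def handle(signal: Signal): Unit = {
      shuttingDown.set(true)
      // Let in-flight requests finish, then exit for real. System.exit still runs
      // the JVM shutdown hooks, so Play stops the actor system, closes database
      // connections, etc.
      actorSystem.scheduler.scheduleOnce(30.seconds) {
        System.exit(0)
      }(actorSystem.dispatcher)
    }
  })

  // Used by the health check to start reporting this instance as unhealthy.
  def isShuttingDown: Boolean = shuttingDown.get()
}
```

For the handler to be installed at startup, the service has to be instantiated eagerly, for example by binding it with eagerly() in a Play module.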

By swallowing SIGTERM, we allow the current requests to finish, and we take the opportunity to delay the shutdown of the application using the Akka scheduler.

There is still one issue: the application is still running, so it can potentially receive new requests. To make sure Kubernetes stops sending traffic to the application, we need to leverage the SignalHandler in the liveness/readiness implementation:
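Here is a rough sketch of such a health check in a Play 2.6-style controller; the controller name, action name and response bodies are illustrative:

```scala
import javax.inject.{Inject, Singleton}

import play.api.mvc.{AbstractController, ControllerComponents}

@Singleton
class HealthController @Inject()(signalHandler: SignalHandler, cc: ControllerComponents)
    extends AbstractController(cc) {

  // Exposed on a route that both the liveness and readiness probes point at.
  def check = Action {
    if (signalHandler.isShuttingDown) ServiceUnavailable("shutting down")
    else Ok("healthy")
  }
}
```

The probes then target the corresponding route, for example GET /health mapped to controllers.HealthController.check in the routes file.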

Once SIGTERM has been swallowed, the method isShuttingDown starts returning true, which turns into a 503 HTTP response, so the application won't receive any new requests because it is considered unhealthy.

Combined with a long terminationGracePeriodSeconds setting in your pod spec, this gives the application enough time to finish processing the current requests. That's how we achieved zero-downtime deployments.
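For completeness, here is roughly what the relevant part of the pod configuration could look like; the probe path, port and timings below are illustrative examples, not our actual values:

```yaml
spec:
  # Give the application plenty of time between SIGTERM and SIGKILL.
  terminationGracePeriodSeconds: 60
  containers:
    - name: play-app
      readinessProbe:
        httpGet:
          path: /health
          port: 9000
        periodSeconds: 5
      livenessProbe:
        httpGet:
          path: /health
          port: 9000
        periodSeconds: 10
```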
