Gracefully Shutdown your Go Application

Alfian Dhimas
Tokopedia Engineering
5 min read · Mar 9, 2021
[Image: Go with graceful shutdown, a modified version of the Go logo]

If you would like to read in a different ambiance, try to read the original post on my personal site.

A Common Problem

[Image: every time you deploy your code changes]

Look familiar? I hope not. But if it does, you may need to diagnose and debug your application further. The image above describes a problem that most web server applications run into at some point. Every time we deploy new code changes, the running service is terminated first so that the new version can come up. If the service is handling requests at that moment and does not shut down gracefully, those requests are never completed and the clients receive errors.

At Tokopedia, we run many deployments across multiple services every day, so implementing graceful shutdown in our services helps every deployment go smoothly without disrupting ongoing processes.

Beyond that specific problem, this article aims to cover the broader use cases that apply to any Go application.

What should we do?

Before we dive into how to address this problem, let’s define what a graceful shutdown of a process is:

A graceful shutdown is when a process that is being turned off by the operating system (OS) is allowed to perform its tasks of safely shutting down processes and closing connections.

This means that any cleanup task should be done before the application exits, whether that is a server completing its ongoing requests, removing temporary files, and so on.
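To make "a server completing the ongoing requests" concrete, here is a minimal sketch built on the standard library's `http.Server.Shutdown`, which stops accepting new connections and waits for active requests to finish. The `/slow` handler, the 200 ms delay, and the helper name `slowRequestSurvivesShutdown` are assumptions made up for this illustration, not code from our services.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"sync"
	"time"
)

// slowRequestSurvivesShutdown starts a server with a deliberately slow
// handler, begins shutting it down while a request is still in flight, and
// returns the response body the client received. All names are illustrative.
func slowRequestSurvivesShutdown() (string, error) {
	started := make(chan struct{})
	var once sync.Once

	mux := http.NewServeMux()
	mux.HandleFunc("/slow", func(w http.ResponseWriter, r *http.Request) {
		once.Do(func() { close(started) }) // the request is now in flight
		time.Sleep(200 * time.Millisecond) // simulate ongoing work
		fmt.Fprint(w, "done")
	})

	ln, err := net.Listen("tcp", "127.0.0.1:0") // port 0: pick any free port
	if err != nil {
		return "", err
	}
	srv := &http.Server{Handler: mux}
	go srv.Serve(ln) // returns http.ErrServerClosed once Shutdown is called

	type result struct {
		body string
		err  error
	}
	resCh := make(chan result, 1)
	go func() {
		resp, err := http.Get("http://" + ln.Addr().String() + "/slow")
		if err != nil {
			resCh <- result{err: err}
			return
		}
		defer resp.Body.Close()
		b, err := io.ReadAll(resp.Body)
		resCh <- result{body: string(b), err: err}
	}()

	<-started // wait until the handler is running

	// Shutdown waits for active requests, bounded by the context deadline.
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		return "", err
	}

	res := <-resCh
	return res.body, res.err
}

func main() {
	body, err := slowRequestSurvivesShutdown()
	fmt.Println(body, err)
}
```

The client still gets its full response even though the shutdown started mid-request. Without the `Shutdown` call (for example, if the process simply exited), that request would have failed.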

Graceful shutdown is also part of Disposability, one of the factors of The Twelve-Factor App.

To achieve that, we have to listen for the termination signals that the process manager sends to the application, and act accordingly. This means that your app should not terminate immediately when the process manager orders it to do a graceful shutdown.

So, with the information we have so far, what we want to do is basically:

  1. Listen for termination signals from the process manager, such as SIGTERM.
  2. Block the main function until the signal is received.
  3. After receiving the signal, run the clean-up operations and wait until they are done.
  4. Apply a timeout so that the clean-up cannot hang the system.

The Code

Talk is cheap. Show me the code. — Linus Torvalds

Enough theory. Let’s see how the previous steps can be turned into working sample code.

The code itself is pretty straightforward. The main benefit of this approach is that it is reusable: every clean-up operation is registered in one place, which keeps it easy to maintain as our application grows.

The code covers every item on the list above. Let’s break it down:

  • Listen for termination signals from the process manager, such as SIGTERM
s := make(chan os.Signal, 1)

// add any other syscalls that you want to be notified with
signal.Notify(s, syscall.SIGINT, syscall.SIGTERM, syscall.SIGHUP)
<-s

It could be one termination signal or several; it depends on how your app behaves. Most apps need to listen for at least SIGTERM.
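If you are on Go 1.16 or later, the standard library also offers `signal.NotifyContext`, which wraps the same idea in a context that is cancelled when a signal arrives. The sketch below is an alternative to the channel-based snippet, not the code this article is built on; the helper names `waitForTermination` and `selfSignalDemo` are made up here, and the self-signalling via `syscall.Kill` only works on Unix-like systems.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// waitForTermination blocks until ctx is cancelled (a signal arrived) or the
// timeout elapses, and reports whether a signal was received.
func waitForTermination(ctx context.Context, timeout time.Duration) bool {
	select {
	case <-ctx.Done():
		return true
	case <-time.After(timeout):
		return false
	}
}

// selfSignalDemo registers for SIGINT/SIGTERM first, then plays the role of
// the process manager by sending SIGTERM to this very process (Unix only).
func selfSignalDemo() bool {
	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
	defer stop() // restore the default signal behavior

	if err := syscall.Kill(os.Getpid(), syscall.SIGTERM); err != nil {
		return false
	}
	return waitForTermination(ctx, 2*time.Second)
}

func main() {
	if selfSignalDemo() {
		fmt.Println("termination signal received, starting clean-up")
	}
}
```

In a real service you would skip the self-signalling and simply block on `<-ctx.Done()` in `main`. The same context can then be handed to anything that should stop when the signal arrives.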

  • Block the main function until the signal is received
// wait for termination signal and register database & http server clean-up operations
wait := gracefulShutdown(context.Background(), 2*time.Second, map[string]operation{
	"database": func(ctx context.Context) error {
		return db.Shutdown()
	},
	"http-server": func(ctx context.Context) error {
		return server.Shutdown()
	},
	// Add other cleanup operations here
})

<-wait

This code can also be extended with other clean-up or utility operations, e.g. closing Redis connections, sending post-serving metrics, or releasing any resources used for profiling and diagnostics. Please refer to the resource links in the last section of this article for some common use cases.

  • After receiving the signal, run the clean-up operations and wait until they are done
var wg sync.WaitGroup

// Do the operations asynchronously to save time
for key, op := range ops {
	wg.Add(1)
	// copy the loop variables so each goroutine gets its own values
	innerOp := op
	innerKey := key
	go func() {
		defer wg.Done()

		log.Printf("cleaning up: %s", innerKey)
		if err := innerOp(ctx); err != nil {
			log.Printf("%s: clean up failed: %s", innerKey, err.Error())
			return
		}

		log.Printf("%s was shut down gracefully", innerKey)
	}()
}

wg.Wait()

We want the clean-up at shutdown time to be as fast as possible, which is why we spawn a goroutine for each operation.

One thing to remember here is that the clean-up operations must not share resources without synchronization; otherwise the concurrent goroutines may cause a race condition.
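If the operations do need to touch something shared, for example to collect their errors in one place, that shared state has to be guarded. Below is a minimal sketch of one way to do it with a `sync.Mutex`. The helper `runAll`, the `demoRunAll` function, and the "cache" operation with its error message are hypothetical and only serve the example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// operation mirrors the clean-up function shape used in this article.
type operation func(ctx context.Context) error

// runAll runs every operation in its own goroutine and returns the errors of
// the ones that failed, keyed by name. The failed map is shared between the
// goroutines, so every write to it is guarded by a mutex.
func runAll(ctx context.Context, ops map[string]operation) map[string]error {
	var (
		wg     sync.WaitGroup
		mu     sync.Mutex
		failed = make(map[string]error)
	)
	for key, op := range ops {
		wg.Add(1)
		go func(key string, op operation) {
			defer wg.Done()
			if err := op(ctx); err != nil {
				mu.Lock()
				failed[key] = err
				mu.Unlock()
			}
		}(key, op)
	}
	wg.Wait()
	return failed
}

// demoRunAll registers one operation that succeeds and one that fails.
func demoRunAll() map[string]error {
	return runAll(context.Background(), map[string]operation{
		"database": func(ctx context.Context) error { return nil },
		"cache": func(ctx context.Context) error {
			return errors.New("connection already closed")
		},
	})
}

func main() {
	for name, err := range demoRunAll() {
		fmt.Printf("%s: clean up failed: %v\n", name, err)
	}
}
```

Running your tests with `go test -race` is a cheap way to catch the cases where a lock like this was forgotten.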

  • Apply a timeout so that the clean-up cannot hang the system
// set timeout for the ops to be done to prevent system hang
timeoutFunc := time.AfterFunc(timeout, func() {
	log.Printf("timeout of %d ms has elapsed, forcing exit", timeout.Milliseconds())
	os.Exit(0)
})

// cancel the forced exit if the clean-up finishes in time
defer timeoutFunc.Stop()
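The snippet above enforces the timeout by exiting the process. If you would rather let the caller decide what happens on a timeout (log it, exit with a non-zero status, and so on), one possible variation is to race the clean-up against a `context.WithTimeout` deadline. This is a sketch of that alternative, not the approach used in this article; `runWithTimeout` and `demoTimeout` are names invented for it.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// runWithTimeout runs cleanup with a deadline. It returns cleanup's own error
// if it finishes in time, or the context error if the deadline passes first.
func runWithTimeout(timeout time.Duration, cleanup func(ctx context.Context) error) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	done := make(chan error, 1) // buffered so the goroutine never blocks
	go func() { done <- cleanup(ctx) }()

	select {
	case err := <-done:
		return err
	case <-ctx.Done():
		return ctx.Err()
	}
}

// demoTimeout runs one fast and one slow clean-up and reports whether the
// fast one succeeded and whether the slow one hit the deadline.
func demoTimeout() (fastOK bool, slowTimedOut bool) {
	fastErr := runWithTimeout(time.Second, func(ctx context.Context) error {
		return nil
	})
	slowErr := runWithTimeout(50*time.Millisecond, func(ctx context.Context) error {
		select {
		case <-time.After(time.Second): // pretend this takes too long
			return nil
		case <-ctx.Done(): // a well-behaved clean-up stops at the deadline
			return ctx.Err()
		}
	})
	return fastErr == nil, errors.Is(slowErr, context.DeadlineExceeded)
}

func main() {
	fastOK, slowTimedOut := demoTimeout()
	fmt.Println("fast clean-up ok:", fastOK)
	fmt.Println("slow clean-up timed out:", slowTimedOut)
}
```

The trade-off is that a clean-up that ignores its context keeps running in the background after the deadline, whereas `os.Exit` really does stop everything.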

So that’s a graceful shutdown implementation for general Go applications. The sample code is written in Go, but the core idea applies to other languages as well.
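For reference, here is one way the fragments above might fit together into a single `gracefulShutdown` function. Treat it as a sketch assembled from the snippets rather than the exact original: the `operation` type definition is inferred from how it is used, the signal handler is registered before the goroutine starts so that an early signal is not missed, the forced exit uses a non-zero status, and `demo` with its counting operations exists only to exercise the function (it signals itself, which is Unix-only).

```go
package main

import (
	"context"
	"log"
	"os"
	"os/signal"
	"sync"
	"sync/atomic"
	"syscall"
	"time"
)

// operation is a clean-up function run during shutdown (inferred type).
type operation func(ctx context.Context) error

// gracefulShutdown waits for a termination signal, runs all clean-up
// operations concurrently, and closes the returned channel when they finish.
func gracefulShutdown(ctx context.Context, timeout time.Duration, ops map[string]operation) <-chan struct{} {
	wait := make(chan struct{})

	// Register before starting the goroutine so an early signal is not missed.
	s := make(chan os.Signal, 1)
	signal.Notify(s, syscall.SIGINT, syscall.SIGTERM, syscall.SIGHUP)

	go func() {
		defer close(wait)
		defer signal.Stop(s)
		<-s
		log.Println("shutting down")

		// Force an exit if the clean-up takes longer than the timeout.
		timeoutFunc := time.AfterFunc(timeout, func() {
			log.Printf("timeout of %d ms has elapsed, forcing exit", timeout.Milliseconds())
			os.Exit(1) // non-zero status is a choice of this sketch
		})
		defer timeoutFunc.Stop()

		var wg sync.WaitGroup
		for key, op := range ops {
			wg.Add(1)
			go func(key string, op operation) {
				defer wg.Done()
				log.Printf("cleaning up: %s", key)
				if err := op(ctx); err != nil {
					log.Printf("%s: clean up failed: %s", key, err)
					return
				}
				log.Printf("%s was shut down gracefully", key)
			}(key, op)
		}
		wg.Wait()
	}()

	return wait
}

// demo registers two hypothetical operations, sends SIGTERM to this process
// to simulate a process manager, and reports whether both operations ran.
func demo() bool {
	var count int32
	mark := func(ctx context.Context) error {
		atomic.AddInt32(&count, 1)
		return nil
	}
	wait := gracefulShutdown(context.Background(), 2*time.Second, map[string]operation{
		"database":    mark,
		"http-server": mark,
	})
	if err := syscall.Kill(os.Getpid(), syscall.SIGTERM); err != nil {
		return false
	}
	<-wait
	return atomic.LoadInt32(&count) == 2
}

func main() {
	log.Println("all clean-up operations ran:", demo())
}
```

In a real application, `main` would start its servers, call `gracefulShutdown` with the actual clean-up operations, and then block on the returned channel.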

Conclusion

Graceful shutdown is only one of many things that you need to implement to have a resilient & robust application.

We also still need to figure out how to route incoming requests and new tasks to the instances running the latest version of our code during a deployment. That may need its own article, which I might write in the future, so stay tuned!

These are the related external resources that you might find important to learn further on how to handle specific resource cleanup:

I hope you found this article useful. If you did, please share it with others who may need it.

Thanks for reading! Looking forward to hearing your feedback & suggestions.
