Some things in life are worth waiting for, even other services

Dec 9, 2022

Introduction

As application developers, we weigh many considerations when building out cool features for our Go microservices. One of them is how to wait for service dependencies: which services yours depends on, and how long to wait for a connection to each one to become available. Often, one service relies on another service or a database being up and running before it should accept requests and become available itself.

When you design a wait strategy for a service’s dependencies, the term fault tolerance often comes up.

Fault tolerance is the idea that an application keeps behaving predictably when something it relies on fails: it fails fast instead of hanging, controls how and when requests are retried, and has a fallback strategy for handling common errors.

Fault tolerance relates to waiting for dependencies at startup because you need to decide what the expected behavior is when a service dependency never becomes available. You should also decide what cutoff (i.e. how long to wait) makes the most sense for your service and overall solution.

This blog post covers a Go microservice that depends on another service and waits for that dependency to start before becoming available itself. It walks through the layers at which you can implement a service startup wait strategy, the options within the Go community, my personal experience using one of those options, what we learned along the way, and a conclusion to sum things up.

Can’t wait to start the following sections?! Me too! Pun intended 😉

At What Layer Should You Apply Service Startup Waiting?

There are a few different places one could implement the waiting of service dependencies.

Docker Layer:

Docker Compose has the depends_on keyword, which lets you specify the startup order for an application and the services it depends on. Unfortunately, the short form of depends_on only waits for a dependency’s container to start, not for the service inside it to be ready, and the condition form that waited on a health check was removed in version 3 of the Compose file format.
Docker has the ENTRYPOINT instruction, which may be overridden or wrapped with a script that implements a wait strategy. Specifically, you can set the entrypoint override in the Docker Compose file to add the wrapper script. A common script for this is vishnubob’s wait-for-it bash script. However, after overriding the Dockerfile ENTRYPOINT, you may have to append the Dockerfile CMD to the Docker Compose entrypoint, because setting entrypoint in the Compose file clears the image’s default CMD.

Within the Go microservice application code itself:

Establishing the waiting of service dependencies within the Go application logic itself is useful when running Go microservices natively. This means you can have the same startup wait strategy set for your service in a Docker environment, and when you run the service on your machine locally without Docker.

You should choose the approach that makes the most sense for your project. You also want this wait strategy reflected in your integration tests.

Go Tool & Package Options Overview

I looked at two Go options for establishing a service startup wait strategy. Please note that I have a personal bias toward wait-for-it, since it is the one I have used.

Wait-for-it (v0.2.13):

This Go tool was inspired by vishnubob’s bash script. It is a Go utility that waits for a TCP host and port to become available. The current implementation lets you build a binary to run inside your Docker containers, so that they can wait for other services using the Docker layer approach. This requires you to work with an image that has Go installed. It is not published as an importable package.
It has an MIT license and recent repository activity.

Net-wait-go (v1.3):

This is a utility and Go package that waits for a network port to become available on the server side. The authors have packaged this implementation so that you can use it from within your Go microservice application logic or within the Docker layer.
It is unlicensed and has had no activity in the past 2 years.

Personal Experience Using Wait-for-it

My current project needed startup waiting defined within our Go code, because we wanted our services to behave the same way when run natively as when run with Docker. We found out about wait-for-it and proceeded from there.

Because wait-for-it in its current state is not packaged for others to consume, we ended up copying its logic into our application’s pkg/ directory, while providing appropriate credit to the author. We created:

├─ project/
│  ├─ ...
│  ├─ pkg/
│  │  ├─ ...
│  │  ├─ wait/
│  │  │  ├─ services.go
│  │  │  └─ wait.go
│  │  └─ ...
│  └─ ...
└─ ...

services.go defines the wait.Services type, along with the maximum timeout and its error, as follows:

package wait

import (
  "fmt"
  "time"
)

type Services []Service
type Service string

const (
  ServiceMyService Service = "myservice:8080"
  // ...

  ServiceMaxTimeout = 1 * time.Minute
)

var (
  // %s prints the duration as "1m0s"; %d would print the raw nanosecond count.
  ErrServiceMaxTimeout = fmt.Errorf("max service startup timeout duration of %s exceeded waiting for service dependencies", ServiceMaxTimeout)
)

wait.go contains the wait logic from the wait-for-it repository, with minor modifications:

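// Imports needed: net, sync, time, and the project's logger package.

// wait blocks until every service accepts a TCP connection or tSeconds elapse.
// It returns true only if all services became available in time.
func wait(lc logger.LoggingClient, services Services, tSeconds int) bool {
 t := time.Duration(tSeconds) * time.Second
 now := time.Now()

 // Closing done on return tells any waitOne goroutines still polling to stop.
 done := make(chan struct{})
 defer close(done)

 var wg sync.WaitGroup
 wg.Add(len(services))
 success := make(chan bool, 1)

 go func() {
  for _, service := range services {
   go waitOne(lc, service, &wg, now, done)
  }
  wg.Wait()
  success <- true
 }()

 select {
 case <-success:
  return true
 case <-time.After(t):
  return false
 }
}

// waitOne dials a single service once per second until it connects or done is closed.
func waitOne(lc logger.LoggingClient, service Service, wg *sync.WaitGroup, start time.Time, done <-chan struct{}) {
 defer wg.Done()
 for {
  conn, err := net.Dial("tcp", string(service))
  if err == nil {
   conn.Close() // only reachability matters; do not leak the connection
   lc.Infof("%s is available after %s", service, time.Since(start))
   return
  }
  // A failed dial usually means the dependency is still starting, so retry.
  select {
  case <-done:
   lc.Errorf("failed to dial service %s with error: %s", service, err.Error())
   return
  case <-time.After(time.Second):
  }
 }
}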

We also added wait.ForDependencies(…) to wait.go, as shown below:

// ForDependencies allows the service to wait for its dependencies to be up and ready.
// It checks them in intervals of serviceRequestTimeout. If an interval ends and the
// dependent services are not available yet, it starts another interval, and keeps doing so
// until the dependencies are up or the maximum wait time of ServiceMaxTimeout (1 minute) is reached.
func ForDependencies(lc logger.LoggingClient, dependentServices Services, serviceRequestTimeout time.Duration) error {
 if len(dependentServices) == 0 {
  return nil // nothing to wait for
 }

 serviceTimeout := int(serviceRequestTimeout.Seconds())
 if serviceTimeout < 1 {
  serviceTimeout = 1 // avoid a zero-length interval that would spin
 }
 success := make(chan bool, 1)
 stop := make(chan struct{})
 defer close(stop) // stops the retry loop once ForDependencies returns

 go func() {
  lc.Infof("Service startup timeout invoked to wait %d seconds for dependent services %s", serviceTimeout, dependentServices)
  for !wait(lc, dependentServices, serviceTimeout) {
   select {
   case <-stop:
    return
   default:
    lc.Infof("Waiting for service dependencies to become available...")
   }
  }
  success <- true
 }()

 // return err if service wait time exceeds ServiceMaxTimeout time
 select {
 case <-success:
  return nil
 case <-time.After(ServiceMaxTimeout):
  return ErrServiceMaxTimeout
 }
}

With this package in place, we bootstrapped our main.go so that each controller calls wait.ForDependencies before starting, with the service dependencies defined as a field on each server struct. If the service dependencies never became available within the time period we specified in our service configuration file, the service logged the failure and exited.

Learnings

Sometimes it’s hard to know which layer makes the most sense when choosing whether to define the service startup waiting in your Go application code or within your Docker layer.

For us, it made sense to implement it within the Go microservice application logic, as we wanted the same logic applied if we ran the applications natively or using Docker. We also weren’t 100% sure Docker was going to be installed on the Windows machine running some of our services. We needed a way to have the same service startup waiting defined for the Windows-based Go microservices, as well as our Go microservices running on a Linux machine.

The current implementation of wait-for-it is great but, as is the case for many things, not perfect.

There are a few drawbacks we found with the current implementation of the utility. For starters, it is not packaged. I have some local changes that I need to open a PR for on the wait-for-it repository to help change this for others moving forward.

In addition, copying only the wait logic from the repository removed the configurable service wait timeout. The timeout was a flag on the wait-for-it Go utility, so copying only the wait logic also dropped the default timeout, which caused our services to wait forever if a dependency never started. We had to add this back in on our end when packaging it up.

It is important to consider your fault tolerance strategy for the case where your service’s dependencies never become available, and how long you are willing to wait.

Within wait.ForDependencies() we added a ServiceMaxTimeout configuration. We bootstrapped this timeout configuration to our service configuration setup, along with a success channel that signals when the service dependencies become available. If the dependencies never became available, time.After(ServiceMaxTimeout) ensured that once the max timeout was exceeded, an error was returned to the waiting service. Upon error, the service would exit, as its dependencies never became available in time. This is what made sense for us, but it may not be the best choice for other use cases.

Conclusion

On a final note, they say that things worth having in life are worth waiting for. I fully agree, and naturally apply that thought to my day-to-day life as a Go microservice application developer 😊. Ensuring that your service’s dependencies are up and ready is worth a slightly delayed start for your solution before it accepts connections and requests that expect everything to be ready to go.

Waiting for service dependencies is common in containerized, microservice-based solutions. It should be thought through to determine which layer of your application should hold the waiting implementation, what fault tolerance strategy to use, and how best to implement it. Of course, there are always pros and cons to every decision we make!

Currently, I am wrapping up changes to a forked version of the wait-for-it repository and hope to open a PR soon. Since there has been recent movement within the repository, it should be a great opportunity to improve an open source tool that my team and I have personally used.