Migrating from ECS to EKS: Service discovery

We recently migrated our production applications from Amazon Elastic Container Service (ECS) to Amazon Elastic Kubernetes Service (EKS). We wrote previously about log shipping to CloudWatch. This post is about utilising our service mesh, provided by Linkerd, to allow service discovery and inter-cluster communication.

Amazon are a cloud compute company, and many of their original products and features point towards making it easier to buy more EC2 instances. Instances are the core inventory of their platform. With this in mind, it doesn’t really matter which container scheduler you use; the common denominator is EC2 instances. Wrap the instances up in a few autoscaling and security groups and you can mesh them into a unified set of services (in our case using Linkerd). Below are the options that we considered and tested, each with a verdict about what we did and didn’t like.

When we set out on the migration from ECS to EKS we had a few requirements:

  1. No big-bang rewrites
  2. Change one thing at once
  3. Any changes we make should exploit native Kubernetes features rather than home-grown hacks.

This meant that we didn’t want to replace our service discovery system (Consul) and service mesh (Linkerd) as part of this process.

Linkerd provides the service mesh and uses dtabs to define the rules for how services can communicate with each other. It also implements retries and timeouts for inter-service communication. It also enabled the migration, allowing us to easily mesh the two clusters (ECS and EKS). Services on either cluster only needed to know about Linkerd and nothing else.

Existing ECS setup

We were running Linkerd on our ECS nodes in linker-to-linker mode. Having Linkerd on both ends of the communication between instances allows transparent TLS and a bunch of other benefits (timeouts, retries, circuit breaking, etc.). It means that we don’t have to implement that logic in all the new services we create. The entire Linkerd setup is out of scope of this article, but we’ve written about it before.

The diagram on the left shows an example; the login service talking to the user service would have a request flow that went through Linkerd when leaving and entering the node. This way you would get TLS encrypted communication between nodes for free. Linkerd in this linker-to-linker mode provides load balancing and service discovery by integrating with a service discovery backend. For us, this service discovery backend is consul.

To be able to migrate from ECS to EKS, and use Linkerd in the current service mesh format, we would need consul (the service discovery backend) to be updated with the addresses of services regardless of which container scheduler was running them.

Updating Consul — Options:

Option 1: replicate our ECS setup in EKS

Our ECS setup exploited the user-data of the EC2 instances’ launch config to run three docker containers: Linkerd, Consul Agent and Registrator. The setup was already capable of registering (and de-registering) services with consul when those services started and stopped, using the Registrator (gliderlabs/registrator) container on every node. This container listens to the docker socket for container start and stop events, and then updates a local consul agent (which is responsible for updating the consul cluster). The setup, for any given node, looks something like this:

containers running on a node for service discovery

Here we see Linkerd and registrator using the local consul agent for service discovery.
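
As a rough illustration of the Registrator step, the sketch below maps a container event to the matching call against the local consul agent’s HTTP API (the service register and deregister endpoints). The helper name consul_request_for_event, the flattened event shape and the agent address are assumptions made for illustration; this is not Registrator’s actual code, and it only builds the request rather than sending it.

```python
import json

# Assumed address of the node-local consul agent (8500 is consul's default HTTP port).
CONSUL_AGENT = "http://127.0.0.1:8500"


def consul_request_for_event(event):
    """Translate a simplified container event into (method, url, body).

    `event` is a hypothetical, flattened view of a Docker event:
    {"status": "start" | "die", "id": ..., "name": ..., "ip": ..., "port": ...}
    Returns None for events that do not affect service discovery.
    """
    # Derive a stable registration ID from the service name and container ID.
    service_id = f"{event['name']}-{event['id'][:12]}"

    if event["status"] == "start":
        body = {
            "ID": service_id,
            "Name": event["name"],
            "Address": event["ip"],
            "Port": event["port"],
        }
        url = f"{CONSUL_AGENT}/v1/agent/service/register"
        return ("PUT", url, json.dumps(body))

    if event["status"] == "die":
        url = f"{CONSUL_AGENT}/v1/agent/service/deregister/{service_id}"
        return ("PUT", url, None)

    # Any other event (pause, rename, ...) is ignored in this sketch.
    return None
```

Note that the registration happens as soon as the container starts, with no notion of whether the application inside it is ready for traffic.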

Option 1 consisted of attempting to replicate this registrator → consul-agent setup on the EKS nodes, just how it was running on the ECS nodes. The problem is that pods run by Kubernetes are not as simple as tasks in ECS. Each pod in Kubernetes has a “pause container” that creates and maintains the shared namespace that all of the rest of the containers in the pod will use. It’s the first to start and all other containers are spawned from it. This allows the application containers to restart without losing the namespace.

Having a pause container start first means that the registrator will register, to consul-agent, the IP address of the pause container and not the IP address of the application container that starts afterwards. We could run consul-agent and linkerd as DaemonSets in Kubernetes, mirroring the once-per-node setup that we have in ECS, but we needed to replace the gliderlabs/registrator.

The goal is to eventually have a native Kubernetes setup, so this was an opportunity to take a step in that direction.

Verdict: we could have made some of the open source forks of registrator work, but we wanted to make a step towards Kubernetes native service discovery and this wouldn’t have been moving in the right direction. Rejected this option due to its complexity.

Option 2: Kubernetes Service with consul-k8s

A service in Kubernetes is how a set of pods is exposed to consumers. Services translate almost directly into consul services. Each of the pods that make up a Kubernetes service should be registered as an address of its corresponding consul service. Kubernetes services also have the benefit of respecting the running state of a container. A pod and the containers that make it up have probes (essentially health checks) that determine whether a container is ready to receive traffic. Taking the naive approach of the registrator and registering containers as soon as they’ve started doesn’t respect whether that container is ready to accept traffic. Kubernetes services do respect this, so we investigated how to sync Kubernetes services into consul.
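
To make the readiness point concrete: Kubernetes tracks the pods behind a service in an Endpoints object, which lists ready pods under addresses and pods that are not yet ready under notReadyAddresses. The minimal sketch below uses a hand-written dictionary in place of a real API response; the function name and the sample data are illustrative assumptions.

```python
def ready_addresses(endpoints):
    """Return (ip, port) pairs for the ready pods behind a service."""
    result = []
    for subset in endpoints.get("subsets", []):
        ports = [p["port"] for p in subset.get("ports", [])]
        # Only `addresses` is read; `notReadyAddresses` is deliberately
        # ignored so that unready pods never receive traffic.
        for address in subset.get("addresses", []):
            for port in ports:
                result.append((address["ip"], port))
    return result


# Hand-written stand-in for an Endpoints object (made-up IPs).
sample_endpoints = {
    "subsets": [
        {
            "addresses": [{"ip": "172.16.4.9"}],
            "notReadyAddresses": [{"ip": "172.16.4.10"}],
            "ports": [{"port": 8080}],
        }
    ]
}

print(ready_addresses(sample_endpoints))  # [('172.16.4.9', 8080)]
```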

The week that we started investigating, HashiCorp announced consul-k8s. This project considers annotations on Kubernetes services and registers the backend pods of that service to consul, pretty much exactly what we needed. But there were some problems. Firstly, the project was about one week old and a bit buggy (pretty scary to put into production). Secondly, the fact that we run Linkerd in a linker-to-linker setup became a problem: the first iterations of consul-k8s registered the pod IPs in consul, which do not match the node IPs.

The linker-to-linker setup that we use follows a flow similar to: source service → Linkerd on the source node → Linkerd on the target node → target service.

This means that for each of the pods that make up a service in service discovery we need the host IP and port. This is needed so that we can take the target address of the downstream service and change the port to get the address of the Linkerd container running on the target service’s node.
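
The port swap described above can be sketched as follows. The function name and the Linkerd incoming port (4141 here) are assumptions for illustration, not values taken from our configuration, and the parsing only handles plain host:port addresses.

```python
# Assumed port on which the node-local Linkerd accepts incoming traffic.
LINKERD_INCOMING_PORT = 4141


def linkerd_address_for(target_address, linkerd_port=LINKERD_INCOMING_PORT):
    """Swap the service port for the Linkerd port, keeping the host.

    `target_address` is a "host:port" string as found in service discovery.
    """
    host, _, _service_port = target_address.rpartition(":")
    if not host:
        raise ValueError(f"expected host:port, got {target_address!r}")
    return f"{host}:{linkerd_port}"


# Host IP registered: the result points at Linkerd on the target node.
print(linkerd_address_for("10.0.1.17:31080"))  # 10.0.1.17:4141

# Pod IP registered: the result keeps the pod IP, which is not the node
# address that Linkerd is reachable on.
print(linkerd_address_for("172.16.4.9:8080"))  # 172.16.4.9:4141
```

The second call shows why the registered address has to be the host IP: the transformation only changes the port, so whatever host is in service discovery is the host the request is sent to.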

We can get a static port for pods on each node by exposing them as a NodePort service with Kubernetes. But at the time consul-k8s was not capable of registering the host IP of a pod in service discovery, meaning that this port transformation approach would result in an address that didn’t match that of the Linkerd container.

Verdict: It would have been great to use a tool backed by a huge open source contributor like HashiCorp, but it wasn’t mature enough at the time and didn’t have the feature set we needed. Rejected because of missing features (and a few launch-week teething issues).

Option 3: polling pod

We opted to create a single pod containing an app that uses the Kubernetes API to check for services that should be registered (using annotations), and then registers those services with a consul-agent running as a sidecar.

Using a consul-agent sidecar with a static host name and consul ID meant that even if this pod died, the restarted pod would assume the position of the previous pod, allowing the sync to consul to keep working. It also gave us much finer-grained control over which pods we wanted to register and in which statuses (Pending, Running, etc.). Finally, it ensured that we could register the IP of the host that the pod was running on, solving the issues caused by consul-k8s in a linker-to-linker setup.
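
A minimal sketch of one polling pass is shown below, assuming the services and pods have already been fetched from the Kubernetes API as plain dictionaries. The annotation key, the function names and the registration ID format are hypothetical; the sketch also ignores namespaces and only looks at the first port of each service. It is not our production poller, only an illustration of the idea: compute what should be in consul from ready pods, then diff that against what is currently registered.

```python
REGISTER_ANNOTATION = "example.com/consul-register"  # hypothetical annotation key


def pod_is_ready(pod):
    """True if the pod is Running and its Ready condition is "True"."""
    status = pod["status"]
    if status.get("phase") != "Running":
        return False
    return any(
        c["type"] == "Ready" and c["status"] == "True"
        for c in status.get("conditions", [])
    )


def desired_registrations(services, pods):
    """Map registration ID -> consul payload for ready pods of annotated services."""
    desired = {}
    for svc in services:
        meta = svc["metadata"]
        if meta.get("annotations", {}).get(REGISTER_ANNOTATION) != "true":
            continue
        selector = svc["spec"].get("selector") or {}
        ports = svc["spec"].get("ports") or []
        node_port = ports[0].get("nodePort") if ports else None
        if not selector or node_port is None:
            continue  # nothing to match on, or not a NodePort service
        for pod in pods:
            labels = pod["metadata"].get("labels", {})
            if not all(labels.get(k) == v for k, v in selector.items()):
                continue
            if not pod_is_ready(pod):
                continue
            reg_id = f"{meta['name']}-{pod['metadata']['name']}"
            desired[reg_id] = {
                "ID": reg_id,
                "Name": meta["name"],
                # Register the host IP and NodePort, not the pod IP.
                "Address": pod["status"]["hostIP"],
                "Port": node_port,
            }
    return desired


def reconcile(desired, registered_ids):
    """Return (payloads to register, IDs to deregister) for one polling pass."""
    to_register = [desired[i] for i in sorted(desired) if i not in registered_ids]
    to_deregister = sorted(i for i in registered_ids if i not in desired)
    return to_register, to_deregister
```

Each pass would then send the register and deregister calls to the consul-agent sidecar. Because the desired state is recomputed from scratch every time, a restarted poller converges on the same result as the one it replaced.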

Verdict: This gave us finer-grained control over what is registered in service discovery. The registrator-style options have no knowledge of the health checks built into Kubernetes. Services respect the liveness and readiness probes of a container, allowing only ready-to-accept-traffic pods to be registered in service discovery.

How’s it going now?

Very well! We chose to replicate the Kubernetes services API state in consul using a polling pod. This allows us to have knowledge of pod and container health and respect the liveness and readiness probes, which is particularly important during deploys when containers are starting and stopping.

Ultimately the setup allowed us to tear down services on ECS and turn them up on EKS, one by one. This was done without any of the other services caring about which container scheduler was being used to orchestrate and run those applications. It allowed us to iteratively improve and migrate our apps and defended against the bugs often introduced in big-bang style changes.

We ran Linkerd in a linker-to-linker setup, with consul as the service discovery backend, but used a poller of the Kubernetes API to populate the list of services into consul. This allowed us to register only the “ready” services in consul and eventually meant we could replace the service discovery backend entirely (replacing consul with the Kubernetes services API) without significant changes to our Linkerd setup.

We’ve successfully torn down our ECS clusters in favour of the EKS ones, and have since updated to use Kubernetes service discovery mechanisms, removing the need for consul and any sync containers.