[Image: If only life were so simple. Art by black13roses]

Who Would Win in a Fight Between Kubernetes and ECS?

Anne Currie
Microscaling Systems
6 min read · Nov 22, 2016


We (Microbadger) currently run on AWS with Google’s Kubernetes (k8s) as our container orchestrator. In this post, I’ll talk about some of the differences between running AWS’s home-grown orchestrator ECS (Elastic Container Service) and k8s on AWS.

One difference between the two orchestrators is that k8s is open source and ECS is not (strictly speaking, the ECS agent is open source; it’s the API / back end that is closed). However, we’re not going to focus on that, but on their functionality.

What are Instances and Clusters?

On AWS, a single machine is called an instance (aka “node” in k8s-speak). A cluster is a set of instances, or nodes, managed by an orchestrator.

To run a high availability (HA) containerized system you probably want to build a cluster.

Aside — if you don’t need high resilience there are other options. For example, you could have a single VM and run lots of containers on it to save costs; Dokku is a mini PaaS for this use case (https://github.com/dokku/dokku). You could also run your workers on multiple VMs but with only one master (see “What’s in a Cluster?” below). There are lots of alternatives. Not all systems need to be properly HA, and HA does cost time and money. We chose to build a cluster, and this article is primarily about clusters. Note, though, that our cluster is currently single-AZ with a single master, so it isn’t HA either!

What’s in a Cluster?

With ECS you just have one node type in your cluster — workers (they do the work). Each worker node runs an ECS agent and can host multiple containers.

With k8s you have 2 node types in your cluster: masters and workers. Each k8s worker runs a kubelet agent and can host multiple k8s “pods” — a pod is a set of one or more containerised applications that always deploy and run together.
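
To make the pod idea concrete, here’s a minimal sketch using the official Kubernetes Python client (the names and images are hypothetical, and in practice you’d more likely write a YAML manifest and apply it with kubectl):

```python
from kubernetes import client, config

config.load_kube_config()  # assumes a kubeconfig pointing at your cluster

# A single pod containing two containers that always deploy, run and die together.
pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="web-with-sidecar", labels={"app": "web"}),
    spec=client.V1PodSpec(containers=[
        client.V1Container(name="web", image="nginx:1.11"),
        client.V1Container(name="log-shipper", image="fluent/fluentd:v0.12"),
    ]),
)
client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```

The kubelet on whichever worker the pod lands on pulls the images and runs both containers side by side.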

You Cannot Have Two Masters

Unlike with ECS, in a k8s cluster you also need master node instances. ECS provides and manages masters for you, but with k8s you have to do this yourself.

  • Masters don’t host pods — their job is to control your cluster. Aside: in k8s you apparently _can_ get masters to host pods, though it must be fiddly because we haven’t worked it out. We think this may be a dev feature so you can run single-node clusters, but it could also make sense more generally, because the master machine is likely to be underutilised a lot of the time. (Thanks @try_except_ for the heads up.)
  • You need at least one master per cluster.
  • Because masters reach agreement via etcd, you need enough of them up to form a quorum, which in practice means an odd number of masters (1, 3, 5 or 7). So for HA you can’t just have 2 masters; you need at least 3 (see the sketch below).
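
A quick back-of-the-envelope sketch (plain Python, nothing cluster-specific) of why 2 masters buys you nothing over 1:

```python
# etcd can only make progress while a majority (quorum) of masters is up,
# so what matters is how many failures each cluster size can tolerate.
for masters in (1, 2, 3, 5, 7):
    quorum = masters // 2 + 1
    print(f"{masters} masters -> quorum {quorum}, tolerates {masters - quorum} failure(s)")
```

With 2 masters the quorum is 2, so losing either one stalls the cluster; 3 is the first size that survives a failure.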

A k8s master node runs several processes:

  • an API server (which handles requests from workers and from the kubectl command line tool)
  • a Proxy (which handles networking and service discovery)
  • a Controller Manager (which ensures everything is running as it should, e.g. replacing crashed pods)
  • a Scheduler (which decides which worker each new pod should run on)
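
Because everything goes through the API server, you can also ask it how those components are doing. Here’s a small sketch with the Kubernetes Python client (it hits the same componentstatuses endpoint that `kubectl get componentstatuses` uses):

```python
from kubernetes import client, config

config.load_kube_config()  # assumes a kubeconfig pointing at your cluster
v1 = client.CoreV1Api()

# The API server reports the health of the other control-plane pieces
# (scheduler, controller manager, etcd).
for cs in v1.list_component_status().items:
    conditions = ", ".join(f"{c.type}={c.status}" for c in (cs.conditions or []))
    print(cs.metadata.name, conditions)
```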

Networking

Both ECS and k8s have built-in service discovery (hurray!). k8s networking is more sophisticated: it automatically assigns every pod its own IP address (to make this work, k8s allocates a pod subnet to each instance). With ECS, containers share their instance’s IP address and each just gets its own port number.
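
You can see the pod-per-IP model for yourself by listing pods through the API; a small sketch with the Python client:

```python
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# Every pod has its own IP address, regardless of which instance it runs on;
# with ECS you'd instead see containers sharing the host IP on different ports.
for pod in v1.list_pod_for_all_namespaces().items:
    print(f"{pod.metadata.namespace}/{pod.metadata.name} -> {pod.status.pod_ip} on {pod.status.host_ip}")
```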

ECS gets its service discovery by integrating with the ELB and ALB (Application Load Balancer) services (note — ALB has better port mapping support than ELB, e.g. dynamic host ports).
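
For comparison, here’s roughly what wiring an ECS service to an ALB target group looks like with boto3 (all the names, ARNs and the region below are hypothetical, and the target group and task definition are assumed to already exist):

```python
import boto3

ecs = boto3.client("ecs", region_name="eu-west-1")  # hypothetical region

# Registering the service against an ALB target group is what gives ECS its
# service discovery: the ALB keeps track of which instance/port each task gets.
ecs.create_service(
    cluster="demo-cluster",
    serviceName="web",
    taskDefinition="web-task:1",
    desiredCount=2,
    role="ecsServiceRole",  # IAM role that lets ECS register targets with the ALB
    loadBalancers=[{
        "targetGroupArn": "arn:aws:elasticloadbalancing:eu-west-1:123456789012:targetgroup/web/hypothetical",
        "containerName": "web",
        "containerPort": 8080,
    }],
)
```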

We also use the Kubernetes DNS addon, which is very handy and commonly used. ECS comes with DNS.

Aside — k8s has some clever concepts that ECS just doesn’t have, for example “Deployments”, which manage a collection of pods and are how rolling deploys work. We also make significant use of k8s’ good metadata support: we use label selectors to control which pods to expose and “annotations” to say which cert to use, as in the sketch below. We’ll write a more detailed post in future about k8s metadata support, because metadata is what we do at Microbadger :-)
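
Here’s a hedged sketch of the kind of Service we mean (Python client again; the annotation key is the standard AWS one, but the cert ARN and names are hypothetical): a label selector picks which pods to expose, and an annotation tells AWS which cert to put on the ELB.

```python
from kubernetes import client, config

config.load_kube_config()

svc = client.V1Service(
    metadata=client.V1ObjectMeta(
        name="web",
        annotations={
            # Tells the AWS cloud provider which ACM cert to attach to the ELB.
            "service.beta.kubernetes.io/aws-load-balancer-ssl-cert":
                "arn:aws:acm:eu-west-1:123456789012:certificate/hypothetical",
        },
    ),
    spec=client.V1ServiceSpec(
        type="LoadBalancer",
        selector={"app": "web"},  # expose only pods carrying the label app=web
        ports=[client.V1ServicePort(port=443, target_port=8080)],
    ),
)
client.CoreV1Api().create_namespaced_service(namespace="default", body=svc)
```

With the DNS addon mentioned above, other pods can then reach this service by name, e.g. web.default.svc.cluster.local.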

Availability Zones & Network Issues

In theory, a cluster can span availability zones, but AWS doesn’t understand k8s pod IP addresses, so that can cause routing issues between AZs (unless you use a networking solution like Calico). This is not a problem for ECS. Similarly, k8s will hit issues when you get to around 50 nodes, because k8s adds an AWS route per node and route tables are limited to 50 entries by default. There are several solutions to that, which we’ve outlined in this post. Again, ECS doesn’t have these issues on AWS.

Aside — the best practice for HA would be to have multiple clusters in separate AZs (as AZs can go down, but region outages are much rarer).

Setup

You can set up an ECS cluster via the AWS UI by hand fairly easily.

For k8s we use kops to create our clusters, and that works fine. kops is a tool built specifically for bootstrapping k8s on AWS; it calls all the AWS APIs for us and spits out nice Terraform templates.

For k8s, an alternative to kops that we’ll try out in future is kubeadm, which was added to k8s in 1.4. kubeadm is a new tool that runs at setup time on every node to make it easier to form clusters. (Thanks to David Aronchick for the info.)

Autoscaling Groups and Orchestrator Resilience

Both orchestrators (ECS and k8s) provide fault tolerance at the container or pod level, so you can ensure that if one of your containerized applications falls over it is automatically restarted.
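
In k8s terms, the usual way to get that is a Deployment with a replica count; a minimal sketch (hypothetical names and image):

```python
from kubernetes import client, config

config.load_kube_config()

# Ask for 3 replicas; if a pod (or the container inside it) dies, the
# controller manager notices and a replacement is scheduled automatically.
deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="web"),
    spec=client.V1DeploymentSpec(
        replicas=3,
        selector=client.V1LabelSelector(match_labels={"app": "web"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "web"}),
            spec=client.V1PodSpec(containers=[
                client.V1Container(name="web", image="nginx:1.11"),
            ]),
        ),
    ),
)
client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
```

An ECS service gives you the same thing via its desiredCount.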

However, if you want to ensure your orchestrator itself is resilient on AWS (making sure all your masters don’t crash) then you have 2 options.

  • ECS handles this for you: AWS runs and manages the masters (the ECS back end) on your behalf.
  • For k8s, you can use the VM-level fault tolerance provided by AWS: Autoscaling Groups. With Autoscaling Groups, AWS automatically replaces crashed instances (and the software on them comes back up) to ensure a target number of instances is maintained. We put all our master instances in one autoscaling group to ensure we maintain a quorum. We do the same with workers (in a different autoscaling group) to make sure we don’t lose resources over time if workers crash.

So, we use AWS auto-scaling to ensure the right number of VMs, while k8s makes sure we have the right number of running pods spread across those VMs.
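
As a rough sketch of the AWS half of that (boto3, with hypothetical region, names, launch configuration and subnet; in practice kops creates all of this for us):

```python
import boto3

autoscaling = boto3.client("autoscaling", region_name="eu-west-1")  # hypothetical region

# Pin the master group at exactly three instances so the etcd quorum survives
# the loss of any single master; AWS replaces crashed instances automatically.
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="k8s-masters",
    LaunchConfigurationName="k8s-master-launch-config",
    MinSize=3,
    MaxSize=3,
    DesiredCapacity=3,
    VPCZoneIdentifier="subnet-aaaa1111",
)
```

The worker autoscaling group would look much the same, just with its own sizes.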

Conclusion: Power or Simplicity?

Of the two, ECS is simpler to use, but Kubernetes feels like the better long-term bet at the moment. It’s easier to get started and do simple things with ECS, but you can potentially do a lot more with k8s and you aren’t locked into the AWS ecosystem. However, k8s takes more effort.

Why Did We Use k8s?

The main reason we went with k8s rather than ECS was that we felt we wouldn’t outgrow k8s. The fact that k8s was open source with an architecture we liked was compelling, especially when combined with its very active community.

k8s already has useful community-built tools like kops but we could see more tools on the horizon, enthusiastically being developed in public by both Google (for example kubeadm and federation) and by others like Deis (Helm) and RedHat (OpenShift).

We want to be part of that community; we want plenty of useful tools for the orchestrator we use and we want to be able to build tools ourselves. k8s is clearly very supportive of that and that’s fundamentally why we moved to it.

By Anne Currie, Ross Fairbanks & Liz Rice


Check out MicroBadger to explore image metadata, and follow Microscaling Systems on Twitter.
