So Docker Swarm is great for quickly spinning up a cluster of Docker hosts and deploying services; compared with single-engine Docker, the learning curve for Swarm mode is low. Kubernetes, by comparison, is much more complex and requires far more setup and administration. Despite this, Kubernetes is becoming the de facto choice in industry over Swarm, probably due to its power and flexibility combined with the availability of cloud-managed Kubernetes services such as AWS EKS.
For example, when configuring ingress with Kubernetes you create a Service that listens on a certain hostname and port, and traffic is then routed through to pods based on their labels. This is really powerful: by updating a label you can re-route the traffic without creating or killing any pods.
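As a rough sketch of what that looks like (the names `web` and `app: web` here are hypothetical), a Kubernetes Service selects pods purely by label, so changing the selector or the pod labels re-points traffic without touching the pods themselves:

```yaml
# Minimal illustrative Service manifest; names are made up for the example.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web        # traffic goes to any pod carrying this label
  ports:
    - port: 80      # port the Service listens on
      targetPort: 8080   # port the pods actually serve on
```

Relabelling a pod (or editing this selector) immediately changes which pods receive traffic, with no restarts involved.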
Doing this in Swarm is trickier. Products such as Traefik provide a reverse proxy whose rules are created dynamically from container labels, in a similar fashion to Kubernetes, but I wanted to explore how best to route traffic natively in Swarm, for simpler and more static environments. I don’t pretend to know all the answers, but after some research I found two main solutions:
Option 1: Routing off cluster
As only one Swarm service can publish on any given port via the ingress overlay network, each service needs its own port: service 1 on 8081, service 2 on 8082, and so on. The downside is that end users don’t want to use unusual ports to access your services; they expect the natural web traffic ports, 80 and 443.
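A stack file for this scheme might look something like the following (the service names and images are placeholders): each service claims its own published port on the ingress network.

```yaml
# Deploy with: docker stack deploy -c stack.yml demo
version: "3.8"
services:
  service1:
    image: nginx:alpine    # stand-in for your real service
    ports:
      - "8081:80"          # each service must publish a distinct port
  service2:
    image: nginx:alpine
    ports:
      - "8082:80"
```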
In order to do this, a load balancer needs to be set up off cluster to listen on these web ports, and then route traffic to the Swarm cluster based on the hostname.
This works well, but it means multiple ports have to be managed in Swarm, and the load balancer becomes a single point of failure (unless you make it resilient).
Option 2: reverse proxy in Swarm
This method runs a reverse proxy service on Swarm that publishes the traditional ports 80 and 443. The service must be connected to an attachable overlay network, and subsequent services then need to attach to that same network. In this way the reverse proxy can route to all the other services based on the hostname.
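Sketched as a stack file (network and service names are illustrative), the key parts are the `attachable: true` flag on the overlay network and the proxy publishing 80/443:

```yaml
version: "3.8"
services:
  proxy:
    image: nginx:alpine     # or Traefik, or any other reverse proxy
    ports:
      - "80:80"
      - "443:443"
    networks:
      - proxy-net
networks:
  proxy-net:
    driver: overlay
    attachable: true        # lets services from other stacks join this network
```

Other stacks then declare `proxy-net` as an external network and attach their services to it, making them reachable from the proxy by service name.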
This reverse proxy can be many things: a simple Nginx server, or something more powerful such as Traefik, which inspects labels to create proxy rules dynamically.
I like this method as it means that the reverse proxy is within the cluster and can therefore be scaled nicely to provide resilience.
Option 2 seems like the way to go, but it raises concerns with more complex stack deployments. Say, for example, you have a stack with a front end and a back end. Users only talk to the front end, not the back end, yet in this option we are connecting all our services to the same network as the reverse proxy for routing. Any service on that network can therefore reach the back end, which is a security risk.
The way around this is to have a third overlay network connecting your stack together, and then connect just the front end of the stack to the central overlay network as well as the new stack network that the back-end services are connected to. This way the back ends are separated from the main network, while the traffic can still route through.
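The setup above can be sketched as a stack file (images and names are hypothetical): the front end sits on both networks, the back end only on the private one.

```yaml
version: "3.8"
services:
  frontend:
    image: myorg/frontend   # hypothetical image
    networks:
      - proxy-net           # reachable by the shared reverse proxy
      - stack-net           # can also talk to its own back end
  backend:
    image: myorg/backend    # hypothetical image
    networks:
      - stack-net           # NOT on proxy-net, so other stacks can't reach it
networks:
  proxy-net:
    external: true          # the attachable network the proxy lives on
  stack-net:
    driver: overlay         # private network scoped to this stack
```

The design choice here is that `stack-net` acts as the stack's internal boundary: only the front end is dual-homed, so the back end is invisible to anything outside the stack while requests still flow proxy → front end → back end.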
This took me a while to figure out, and I didn’t see any articles (I’m sure there are lots) specifically mentioning multiple overlay networks to separate out the back ends, so I thought it was worth discussing.