The mysterious tale of 0.0.0.0:tcp://172.20.244.217:8080
As you may know, I like my talks to have live demos wherever possible, and my chalk-talk sessions for AWS re:Invent are no exception. I’ll be talking about different models for runtime security in containers, and showing these models in three AWS container environments:
- ECS — regular containers on regular EC2 VMs
- Fargate — container instances where we don’t have to deal with the VMs they run on
- EKS — the AWS managed Kubernetes service
For my demo I’m using a little application called gotty, which runs a browser-accessible terminal, and then showing how we can limit the commands available in this terminal using container security solutions like Aqua’s. This simulates what might happen if an attacker opened a reverse shell into a container.
I had my gotty containers running just fine in ECS and Fargate, but they failed quickly on EKS. Examining the logs I’d see something like this:
$ kubectl logs gotty-deployment-5b9f598b7b-ls442
Error: failed to listen at `0.0.0.0:tcp://172.20.244.217:8080`: listen tcp: address 0.0.0.0:tcp://172.20.244.217:8080: too many colons in address
failed to listen at `0.0.0.0:tcp://172.20.244.217:8080`: listen tcp: address 0.0.0.0:tcp://172.20.244.217:8080: too many colons in address
The gotty app was failing to start because of a malformed address. Why on earth would it pick up such a strange address on Kubernetes but not the other container environments?
I tried on a local Kubernetes cluster and found the same thing.
Have you guessed why yet?
Ye olde service discovery
The gotty executable listens on an IP address and port specified by some environment variables, not unreasonably called GOTTY_ADDRESS and GOTTY_PORT.
My YAML has a gotty pod sitting behind a service called gotty, which also seems pretty reasonable to me.
The trouble is that Kubernetes defines an awful lot of environment variables in an attempt to help pods locate services. This (presumably) pre-dates the ability to use DNS for service discovery, and is based on (now legacy) Docker behaviour. If for some strange reason you’re not using DNS, your application code can locate neighbouring services using the addresses and ports defined in these environment variables.
As far as I can tell you don’t have any choice about this, and ALL the environment variables for any existing services get defined when you create a pod. I haven’t tested whether it is genuinely all services, or just those in your namespace, but either way, it’s a lot of environment variables that your application code might not be expecting.
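To give a feel for how many variables a single service produces, here is a minimal Python sketch of the naming pattern Kubernetes documents for a service with one TCP port. The function name service_env_vars is my own invention for illustration, and the sketch ignores multi-port and named-port services.

```python
def service_env_vars(service_name, cluster_ip, port, protocol="TCP"):
    """Return the variables derived for one service port (simplified sketch)."""
    # The service name is upper-cased and dashes become underscores.
    prefix = service_name.upper().replace("-", "_")
    proto = protocol.lower()
    uri = f"{proto}://{cluster_ip}:{port}"
    port_prefix = f"{prefix}_PORT_{port}_{protocol.upper()}"
    return {
        f"{prefix}_SERVICE_HOST": cluster_ip,
        f"{prefix}_SERVICE_PORT": str(port),
        f"{prefix}_PORT": uri,  # a URI, not a port number
        port_prefix: uri,
        f"{port_prefix}_PROTO": proto,
        f"{port_prefix}_PORT": str(port),
        f"{port_prefix}_ADDR": cluster_ip,
    }


# The service name, IP and port are the ones from my error message above.
env = service_env_vars("gotty", "172.20.244.217", 8080)
for name, value in sorted(env.items()):
    print(f"{name}={value}")
```

That is seven variables for one small service, and the one named GOTTY_PORT holds a full URI rather than a number.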
So my gotty application was seeing the environment variable GOTTY_PORT defined as the TCP URI tcp://172.20.244.217:8080. Expecting the variable to contain a port number, it simply appended the whole URI to the IP address (which defaulted to 0.0.0.0).
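The failure is easy to reproduce in a few lines. This is a Python sketch of the joining logic as I have described it, not gotty's actual code (gotty is written in Go); the function name listen_address and the fallback port of 8080 are assumptions for illustration.

```python
def listen_address(environ):
    """Join address and port with a colon, trusting both values blindly."""
    address = environ.get("GOTTY_ADDRESS", "0.0.0.0")
    port = environ.get("GOTTY_PORT", "8080")  # assumed default for this sketch
    return f"{address}:{port}"


# Outside Kubernetes nothing sets GOTTY_PORT, so the defaults apply.
expected = listen_address({})

# Behind a service called gotty, GOTTY_PORT holds a TCP URI.
clobbered = listen_address({"GOTTY_PORT": "tcp://172.20.244.217:8080"})

print(expected)   # 0.0.0.0:8080
print(clobbered)  # 0.0.0.0:tcp://172.20.244.217:8080
```

The second string has three colons where a host:port pair should have one, which is exactly what the error in the logs complains about.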
An isolated case?
It seems likely to me that a lot of applications would behave similarly to gotty, expecting a port number in a variable named APP_NAME_PORT. They will all be similarly confused by the environment variables Kubernetes defines if you put them behind a service called app-name.
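If you want to check your own setup, a rough test is to compare the variables your application reads against the names your services would generate. The helper below is a hypothetical sketch: colliding_variables is not part of any tool, and it only looks at the NAME_PORT variable, not the other variables Kubernetes derives.

```python
def colliding_variables(service_names, app_variables):
    """Return the app's variables that match a service's <NAME>_PORT variable."""
    generated = {
        name.upper().replace("-", "_") + "_PORT" for name in service_names
    }
    return sorted(generated & set(app_variables))


# Hypothetical inputs: two services, and the variables an app reads.
clashes = colliding_variables(
    ["gotty", "my-api"],
    ["GOTTY_ADDRESS", "GOTTY_PORT", "DB_PORT"],
)
print(clashes)  # ['GOTTY_PORT']
```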
You have been warned.
What should happen?
I’m guessing that, for backwards compatibility, these environment variables won’t be going away any time soon. But given the prevalence of DNS-based service discovery, and the likelihood of application code misinterpreting the environment variables, should there be an option to disable them? Am I missing something?
I’ve submitted a PR to gotty so that it can spot the TCP URI that Kubernetes might supply, and respond sensibly by extracting just the port number from it.
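The idea behind that kind of fix can be sketched in Python. This is not the code from the PR, which is in Go; port_from_env is a hypothetical helper, and the default of 8080 is an assumption.

```python
from urllib.parse import urlsplit


def port_from_env(value, default=8080):
    """Accept either a bare port number or a tcp://host:port URI."""
    if not value:
        return default
    if value.isdigit():
        return int(value)
    parsed = urlsplit(value)
    if parsed.scheme == "tcp" and parsed.port is not None:
        # Looks like the Kubernetes-style URI: keep only the port.
        return parsed.port
    raise ValueError(f"cannot interpret {value!r} as a port")


print(port_from_env("8080"))
print(port_from_env("tcp://172.20.244.217:8080"))
```

Both calls yield the same port, so the application behaves the same way whether the variable was set by hand or by Kubernetes. Anything that is neither a number nor a tcp URI is rejected outright rather than passed along to the listener.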