Automatic public DNS for Fargate-managed containers in Amazon ECS

Andreas Pasch
Dec 20, 2018


For development and demo purposes, we deploy our microservices application to Amazon Elastic Container Service (ECS) with Fargate. We want each container to remain accessible at the same public domain after a redeployment. Unfortunately, ECS does not support registering services in public DNS namespaces.

In his article Update IP-Address in Route53 on ECS Fargate redeployments, Marc Logemann came up with an approach using a lambda function, run periodically as a CRON job, to update Route 53 DNS records with the public IP addresses of running containers.

An event-based approach to updating DNS records

While the CRON job solution is certainly valid, we think that an event-based and more generic approach to automated DNS updates is a better fit.

What we want is a lambda function that fires whenever a new container is deployed. The function should then register the public IP of the container in Route 53, under a subdomain derived from the container’s service name.

Setting up the lambda function

We want the function to be reactive, so we use CloudWatch Events as the trigger. ECS emits two types of events:

  • ECS Container Instance State Change
  • ECS Task State Change

The best way to react to something like “container deployed” is to subscribe to the “ECS Task State Change” event. Additionally, we use the task’s desiredStatus and lastStatus fields as a filter, so that we are only notified when a deployment has finished.

To configure and deploy the lambda function we use the awesome Serverless Framework. Our serverless.yml looks like this:

...
functions:
  registerPublicIp:
    memorySize: 128
    handler: src/update-task-dns.handler
    events:
      - cloudwatchEvent:
          event:
            source:
              - "aws.ecs"
            detail-type:
              - "ECS Task State Change"
            detail:
              desiredStatus:
                - "RUNNING"
              lastStatus:
                - "RUNNING"
...

If you wonder why desiredStatus and lastStatus both require the state “RUNNING”: this was the best filter for us, but there may be better ones. Our guess is that the network interface (ENI) is attached to the task in the very last step, when the container is already running.
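Expressed in code, the filter condition amounts to the following sketch. The sample payload is heavily abbreviated and uses placeholder values; only the fields relevant to the filter are shown:

```javascript
// Trimmed sketch of an "ECS Task State Change" event payload
// (placeholder values, most fields omitted).
const sampleEvent = {
  source: 'aws.ecs',
  'detail-type': 'ECS Task State Change',
  detail: {
    clusterArn: 'arn:aws:ecs:eu-central-1:111111111111:cluster/demo',
    desiredStatus: 'RUNNING',
    lastStatus: 'RUNNING',
    // ...attachments, containers, etc.
  },
};

// The CloudWatch rule only matches when both statuses are RUNNING,
// i.e. the task has actually finished starting up.
const isFinishedDeployment = (detail) =>
  detail.desiredStatus === 'RUNNING' && detail.lastStatus === 'RUNNING';

console.log(isFinishedDeployment(sampleEvent.detail)); // true
```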

Configuring the ECS cluster

We want our function to be as generic as possible. It should be able to register public IP addresses for any container, no matter which cluster it resides in. Fortunately, AWS provides tags for clusters, so we tag our cluster with the basic information the lambda function needs to register the public IP of started containers:

  • the Route 53 hosted zone id
  • the domain (base) name for containers in the cluster

Tagging the ECS cluster with hosted zone id and domain
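Instead of tagging through the console, you can also apply the tags with the AWS CLI. The cluster ARN, zone id, and domain below are placeholders; the tag keys domain and hostedZoneId are the ones our function reads:

```shell
aws ecs tag-resource \
  --resource-arn arn:aws:ecs:eu-central-1:123456789012:cluster/fargate-dns-demo \
  --tags key=domain,value=fargate-dns-demo.com key=hostedZoneId,value=Z1EXAMPLE
```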

Implementing the lambda function

The source code is based on Marc Logemann’s implementation. We rewrote it to meet our generic event-driven approach. This is how it works:

When the function is triggered, it receives the ECS task object as a parameter. It then does the following:

  • Fetch the public IP of the container
  • Construct the container’s (sub-)domain
  • Create or update the DNS record for the constructed domain with the public IP

For brevity, we only show the main function here — probably enough to get the idea. Please check out the whole source code on GitHub if you want to get into the details.

exports.handler = async (event) => {
  const task = event.detail;
  // clusterArn looks like "arn:aws:ecs:<region>:<account>:cluster/<name>"
  const clusterArn = task.clusterArn;
  const clusterName = clusterArn.split(':cluster/')[1];

  // Read the hosted zone id and base domain from the cluster tags
  const tags = await fetchClusterTags(clusterArn);
  const domain = tags['domain'];
  const hostedZoneId = tags['hostedZoneId'];

  // Resolve the public IP via the task's network interface (ENI)
  const eniId = getEniId(task);
  const taskPublicIp = await fetchEniPublicIp(eniId);

  // "group" is "service:<service-name>" for tasks started by a service
  const serviceName = task.group.split(':')[1];
  const containerDomain = `${serviceName}.${domain}`;
  const recordSet = createRecordSet(containerDomain, taskPublicIp);

  await updateDnsRecord(clusterName, hostedZoneId, recordSet);
};
...
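To give a rough idea of what the elided helpers do, here is a simplified sketch — not the exact implementation from the repository. The attachment type ('eni'), the detail field name ('networkInterfaceId'), and the TTL are assumptions on our part; fetchEniPublicIp assumes the AWS SDK for JavaScript v2 and its EC2.describeNetworkInterfaces call:

```javascript
// Simplified sketches of the helpers used by the handler above.

// The task event lists its network attachments; the ENI id is buried in the
// attachment details (field names assumed from observed event payloads).
function getEniId(task) {
  const eniAttachment = task.attachments.find((a) => a.type === 'eni');
  const detail = eniAttachment.details.find((d) => d.name === 'networkInterfaceId');
  return detail.value;
}

// Build an UPSERT change for Route 53: create the A record if it does not
// exist yet, otherwise overwrite it with the new IP.
function createRecordSet(domain, publicIp) {
  return {
    Action: 'UPSERT',
    ResourceRecordSet: {
      Name: domain,
      Type: 'A',
      TTL: 180, // short TTL, since the IP changes on every redeployment
      ResourceRecords: [{ Value: publicIp }],
    },
  };
}

// Look up the public IP of the ENI via the EC2 API (AWS SDK v2).
async function fetchEniPublicIp(eniId) {
  const AWS = require('aws-sdk');
  const ec2 = new AWS.EC2();
  const response = await ec2
    .describeNetworkInterfaces({ NetworkInterfaceIds: [eniId] })
    .promise();
  return response.NetworkInterfaces[0].Association.PublicIp;
}

module.exports = { getEniId, createRecordSet, fetchEniPublicIp };
```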

To deploy the function, simply run serverless deploy.

Demo: Starting a container

Once your function is deployed, make sure that:

  • you have a working public hosted zone in Route 53
  • your ECS cluster is tagged with the hosted zone id and domain name

Now start a new (or restart an existing) task in your tagged cluster. When the task has started up, check the public IP in the task details:

Check the public IP of the (re-)started task

When the lambda function has been invoked correctly, the container should now be available at the domain {service-name}.{domain} (in our case nginx.fargate-dns-demo.com). To prove that, open the hosted zone in the Route 53 console. There should be a new (or updated) record set for your container, pointing to the new public IP of the container:

Updated A-record set pointing to the container public IP

Limitations

Clearly, this approach is intended for non-production use cases. There is no DoS protection or any other kind of safeguard when exposing your containers directly via Route 53.

Also, there is no load balancing or scaling with this approach. It only works for the standard case of an ECS service with exactly one running task, since the subdomain is constructed from the container’s service name. For a multi-container-per-service scenario, you would have to use the container name instead of the service name in your script.

Conclusion

Our CloudWatch-triggered lambda solution for updating DNS records on container deployment is fairly simple and straightforward, and it works very well for our use cases. Of course, it would be desirable to have this as a built-in feature in ECS. We still don’t know why Amazon doesn’t allow the use of public namespaces with the ECS Service Registry, but they certainly have their reasons. Once they decide to make it possible, our approach will be obsolete. But for the time being, it may help you make your ephemeral Fargate containers addressable via fixed domains.
