Dangling DNS Records are a Real Vulnerability

This post is inspired by Matthew Bryant’s investigation into identifying dangling domains in Amazon’s Elastic IP pool and by the formal study by Liu et al. in their CCS 2016 paper. I encourage you to read the paper if you want to learn about this problem in detail and how to fix it before bad actors exploit it to attack your systems.

Before I show you the issue, let me get some basics in order.

DNS is the trust anchor of the Internet:

DNS (Domain Name System) not only provides vital naming services, but is also the fundamental trust anchor for accessing services via the Internet. If DNS is broken, we cannot be sure whether we are accessing a genuine service or a malicious one masquerading as it. DNS has long been an attractive target for attackers, and the security community has spent a lot of effort protecting the interactions between client and server during DNS resolution. One such effort is DNSSEC.

How DNS resolution works:

DNS is structured as a tree: the root at the top, then top-level domains, second-level domains, and so on.

(Source: paper)

Based on this hierarchy, DNS resolution follows the workflow shown below when you type www.foo.com in your browser (assuming a cache miss at the rDNS, the recursive resolver, i.e. the resolution result is not already cached there).

(Source: talk)
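
To make the workflow concrete, here is a toy sketch of what the rDNS does on a cache miss. It is not a real resolver: the three "servers" are just in-memory dictionaries, and the server names (root, tld_com, adns_foo) and the address are made up for illustration.

```python
# Toy model of the DNS tree. Each "server" maps a name suffix either to a
# referral (the next server to ask) or to a final answer.
# All server names and addresses below are hypothetical.
SERVERS = {
    "root": {"com.": ("referral", "tld_com")},
    "tld_com": {"foo.com.": ("referral", "adns_foo")},
    "adns_foo": {"www.foo.com.": ("answer", "3.3.3.3")},
}


def resolve(name, cache=None):
    """Walk root -> TLD -> authoritative server, as an rDNS does on a cache miss.

    Returns (answer, path), where path lists the servers that were asked.
    """
    cache = {} if cache is None else cache
    if name in cache:
        return cache[name], ["cache"]
    server, path = "root", []
    while True:
        path.append(server)
        zone = SERVERS[server]
        # Keep only suffixes that match on a label boundary.
        matches = [s for s in zone if name == s or name.endswith("." + s)]
        if not matches:
            return None, path  # no such name in this toy model
        kind, value = zone[max(matches, key=len)]
        if kind == "answer":
            cache[name] = value
            return value, path
        server = value  # follow the referral one level down the tree


cache = {}
first = resolve("www.foo.com.", cache)   # asks root, then .com, then foo.com
second = resolve("www.foo.com.", cache)  # served from the cache
```

The second call never leaves the cache, which is why the workflow above assumes a cache miss.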

If DNSSEC protects this communication, aren’t we safe?

No! While DNSSEC protects the interactions between clients and servers, there is no real authentication of the links between DNS records and the resources they point to. We cannot be sure that those resources are still in use by the intended parties. In other words, the lack of authenticity checking of resolved resources can lead to domain hijacking attacks. Resource authentication in DNS zone files is largely overlooked due to its complexity. A resource may be authenticated when the record is first added, but later be abandoned. If a DNS record points to an ephemeral resource and that resource is no longer used by the party for whom the record was added, an attacker may acquire the resource and effectively control the domain. Such records are analogous to dangling pointers in C programs. A good example of an ephemeral resource is the elastic IPs offered by cloud service providers such as Amazon AWS. Conventional wisdom holds that these dangling records are harmless, but that turns out not to be the case.

Dangling DNS A Record (Source: paper)

Once you are done using an IP, you release it back to the elastic IP pool so that someone else can use it. If you forget to remove any DNS records pointing to this IP directly or indirectly (e.g. via a CNAME), an attacker who gets hold of the same IP could control the domain named in those records. The scary part of the attack is that a bad actor can control your domain without changing a single bit in your DNS zone files. Thus, for such attacks, defenses based on detecting changes to the zone file are utterly ineffective.

Example (I control your domain!)

Let me explain using an example:

Let’s say you have the following DNS record in your zone file:

www.foo.com. A 3.3.3.3

So, you have an A record in your aDNS (authoritative DNS server) pointing to the Elastic IP 3.3.3.3 obtained from the cloud service provider SuperCloud. Everything is alive and well as long as you are using the IP. Now you decide that you no longer need this IP and let SuperCloud move it back to the pool of available IPs. However, you forget to clean up your aDNS and remove the above resource record. Assume that 3.3.3.3 has not yet been acquired by anyone else, i.e. the IP is still sitting in the elastic pool waiting to be used. Now if you try to access www.foo.com, it still resolves to 3.3.3.3 but you get an error when trying to reach the resource. This tells an attacker that there is a dangling DNS record lying around. The attacker can repeatedly reserve IPs at SuperCloud until they get 3.3.3.3. At that point the attacker essentially controls the domain www.foo.com.
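
On the defender's side, the simplest check is to compare the A records in your zone against the IPs you actually hold right now. Here is a minimal sketch, assuming you can export your zone as text lines and obtain the set of currently allocated IPs from your provider (how you fetch that set is not shown). The function name and the second record are my own, for illustration only.

```python
def find_dangling_a_records(zone_lines, owned_ips):
    """Return (name, ip) pairs for A records whose IP we no longer hold.

    zone_lines: records such as "www.foo.com. A 3.3.3.3" or
                "api.foo.com. 300 IN A 4.4.4.4"
    owned_ips:  set of IPs currently allocated to us (assumed to come from
                the cloud provider; fetching it is out of scope here)
    """
    dangling = []
    for line in zone_lines:
        fields = line.split()
        # The record type sits just before the address: <name> ... A <ip>
        if len(fields) >= 3 and fields[-2].upper() == "A":
            name, ip = fields[0], fields[-1]
            if ip not in owned_ips:
                dangling.append((name, ip))
    return dangling


zone = [
    "www.foo.com. A 3.3.3.3",         # the record from the example above
    "api.foo.com. 300 IN A 4.4.4.4",  # hypothetical record we still use
]
suspects = find_dangling_a_records(zone, owned_ips={"4.4.4.4"})
```

Here suspects contains only www.foo.com., because 3.3.3.3 has been released. Run on a schedule, a check like this catches the record before an attacker does.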

Another Example (I control your domain again!)

One more example is in order, this time with a CNAME record. Let’s say you have the following DNS record in your zone file:

pretty.name.com. CNAME ugly.name.com.

You bought the domain ugly.name.com from the domain registrar GoMommy and added the above CNAME record so that your visitors can use pretty.name.com to access your website. All is well initially, but a year later you decide not to renew ugly.name.com because pretty.name.com does not get many visitors. However, you do not clean up the RR in your aDNS, and it is left lying around as a dangling pointer. An attacker who spots this link may buy ugly.name.com from GoMommy and control the domain pretty.name.com. As in the previous attack, the attacker exploits the existing dangling DNS record and makes no modification to the zone file. Attackers get their hands on the resources that dangling DNS records point to!
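
A similar audit works for CNAME records: follow each alias to its final target and ask whether that target still exists. In the sketch below the existence check is passed in as a function, because how you answer it (a DNS lookup, a WHOIS query, your registrar's records) depends on your setup. The function and parameter names are hypothetical.

```python
def find_dangling_cnames(cnames, target_exists, max_hops=8):
    """Return (alias, final_target) pairs whose final target no longer exists.

    cnames:        dict mapping alias -> target, taken from your zone file
    target_exists: callable(name) -> bool, supplied by the caller
    max_hops:      guard against CNAME loops
    """
    dangling = []
    for alias in cnames:
        name, hops = alias, 0
        # Follow the chain, since a CNAME may point to another CNAME.
        while name in cnames and hops < max_hops:
            name = cnames[name]
            hops += 1
        if not target_exists(name):
            dangling.append((alias, name))
    return dangling


cnames = {"pretty.name.com.": "ugly.name.com."}
still_registered = set()  # ugly.name.com. was not renewed
suspects = find_dangling_cnames(cnames, lambda n: n in still_registered)
```

With the record from the example, suspects reports pretty.name.com. as pointing to a target that is gone.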

What are the attack vectors?

Do only elastic IP pools and expired domains lead to such attacks? No, there can be other attack vectors. Another example is abandoned third-party services.

What kind of DNS records create this problem?

Not all dangling records are potentially unsafe, but at least four types are. In addition to the A and CNAME records mentioned above, dangling NS and MX records can also create security vulnerabilities. Please read the paper mentioned above for more details.

Potentially Unsafe Dangling Record Types (Source: paper)

Why are dangling DNS records not removed from DNS zone files?

There could be many reasons for this. One of the most common is that leaving such records in place does not break DNS functionality. Take, for example, a 2-node load-balanced email server solution (mail1.super.com and mail2.super.com). Assume that, for some reason, one domain (mail1.super.com) expires. Resolution of mail1.super.com now always fails and mail delivery falls back to mail2.super.com. The point is that mail delivery continues to work even though one mail server is not used. Functionality is not broken (even though performance degrades in this case). Since nothing breaks, many administrators take the approach “if it works, do not touch it” and thus do not clean up dangling records.

super.com. 60 MX 10 mail1.super.com.

super.com. 60 MX 20 mail2.super.com.

(Try mail1.super.com first; if it is not available, fall back to mail2.super.com.)
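
The fallback is easy to model, and the model shows why nobody notices the dead record. This is a simplified sketch of how a sender walks the MX list (lowest preference value first), not a full mail delivery implementation. The function name and the reachability check are assumptions of mine.

```python
def pick_mail_server(mx_records, is_reachable):
    """Return (chosen_host, hosts_tried) for a list of (preference, host) pairs.

    Lower preference values are tried first. is_reachable is a
    caller-supplied callable(host) -> bool.
    """
    tried = []
    for _preference, host in sorted(mx_records):
        tried.append(host)
        if is_reachable(host):
            return host, tried
    return None, tried


mx = [(10, "mail1.super.com."), (20, "mail2.super.com.")]
# mail1 has expired, so only mail2 answers.
chosen, tried = pick_mail_server(mx, lambda h: h == "mail2.super.com.")
```

Mail still arrives through mail2.super.com., so the only visible symptom is one wasted attempt per delivery. Nothing forces anyone to look at the first record.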

Further, it is widely thought that dangling DNS records are safe and do not open the door to domain hijacking. However, as shown above, this is not the case.

It is not easy to identify dangling records. There is currently no mechanism for periodically checking the authenticity of DNS records. This may be an interesting future direction.

How widespread is the problem?

It is not clear how widespread the problem is, and measuring it would take some effort. However, the paper mentioned above gives a glimpse of the issue, especially with the growing use of elastic cloud provisioning. If anything, the problem looks set to get worse (not better) as ephemeral resources become more widely used.

What can we do about it?

The core question is how we can protect the integrity and authenticity of DNS records returned to clients. The underlying problem is that the resources DNS records point to are ephemeral in nature: elastic IPs, domain names, and so on. Unless we periodically verify their authenticity, dangling records are quite difficult to spot.

Cloud service providers may come up with techniques to prevent users from repeatedly acquiring (and later releasing) IPs from the elastic IP pool. The challenge here is to distinguish malicious use from legitimate use cases where elasticity is a business requirement.
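
One way a provider might approach this is to count allocations per account within a sliding time window and flag accounts that exceed a threshold. The sketch below is only an illustration of that idea, not something any provider is known to do. The class name, threshold and window are made up, and picking values that do not penalize legitimately elastic workloads is exactly the hard part.

```python
from collections import defaultdict, deque


class AllocationChurnMonitor:
    """Flag accounts that allocate IPs unusually often (hypothetical sketch)."""

    def __init__(self, max_allocations, window_seconds):
        self.max_allocations = max_allocations
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # account -> allocation timestamps

    def record_allocation(self, account, timestamp):
        """Record one allocation; return True if the account is over the limit."""
        events = self._events[account]
        events.append(timestamp)
        # Drop allocations that have fallen out of the sliding window.
        while events and events[0] <= timestamp - self.window_seconds:
            events.popleft()
        return len(events) > self.max_allocations


monitor = AllocationChurnMonitor(max_allocations=3, window_seconds=60)
flags = [monitor.record_allocation("acct-1", t) for t in (0, 10, 20, 30)]
later = monitor.record_allocation("acct-1", 200)
```

The fourth allocation within a minute trips the flag, while the one at t=200 does not, because the earlier ones have aged out of the window.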

Interesting Open Issues:

How can we identify dangling records already exploited by bad actors?

How can we efficiently authenticate resources pointed to by DNS records?

As a good administrative practice, always clean up your DNS zone files whenever there is a change in resources. It is easier said than done, but having such a policy could go a long way with respect to DNS record hygiene!