The DNS

Jemutai Sitienei
Feb 12, 2020


It’s surprising how much work goes into accessing a simple site like “mit.edu”. For the internet infrastructure, the identifier “mit.edu” is garbage — the readable hostnames have to be translated to internet addresses first to enable data transmission. Some form of name resolution mechanism is therefore necessary to resolve host or domain names into internet addresses.

Why do we use hostnames in the first place?

  • User friendliness: Imagine having to key in 18.9.22.69 instead of “mit.edu”. That may work for the few domain names you’re familiar with, but the sheer number of internet addresses would make it hard to remember them all.
  • What if you want the name “mit.edu” to map to multiple machines for load distribution in case “mit.edu” gets many hits or to increase performance? What abstraction would make this easier?

This is where the Domain Name System (DNS) comes in.

The Domain Name System is a critical part of the internet.

“It hierarchically distributes the management of names among different naming authorities and also hierarchically distributes name resolution among different name servers and name management among different naming authorities. In computer science it is an excellent case study for naming schemes and hierarchical distribution.” (Saltzer and Kaashoek, Principles of Computer System Design)

The main purpose of the DNS is to associate domain names with IP addresses.

How the DNS works

The main components of the system are:

  • Key-value bindings mapping hostnames to IP addresses
  • Name Servers

Key-value bindings are managed in database tables that are loaded onto DNS servers as often as their managers deem necessary. This means that a change to a binding can take hours to show up.

Name Servers

As their name suggests, name servers maintain a set of name binding records. When a name resolution request comes in, the server searches its records for the binding. If it finds a record, it returns it as the response. Otherwise, it looks through a separate set of referral records, starting with the most significant component of the requested domain name, to find the nearest name server responsible for that part of the name space.

Authoritative name server: a server that holds either a name record or a referral record for a domain name.
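The lookup procedure described above can be sketched roughly as follows. This is only a toy model, not how any real DNS server is implemented: the `NameServer` class, its field names, and the result tuples are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NameServer:
    """Toy name server holding name records and referral records (hypothetical layout)."""
    bindings: dict = field(default_factory=dict)   # hostname -> IP address
    referrals: dict = field(default_factory=dict)  # domain suffix -> address of another name server

    def lookup(self, hostname):
        # 1. If a name record exists, return it as the response.
        if hostname in self.bindings:
            return ("answer", self.bindings[hostname])
        # 2. Otherwise walk the referral records, starting from the most
        #    significant (rightmost) component: "edu", then "mit.edu", ...
        labels = hostname.split(".")
        nearest = None
        for i in range(len(labels) - 1, -1, -1):
            suffix = ".".join(labels[i:])
            if suffix in self.referrals:
                nearest = self.referrals[suffix]  # a longer match is a nearer server
        if nearest is not None:
            return ("referral", nearest)
        return ("not_found", None)
```

A server holding only a referral for "mit.edu" would answer a request for web.mit.edu with `("referral", ...)`, pointing the client to the next server to ask.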

Let’s walk through an example. Suppose we want to resolve the hostname web.mit.edu.

Adapted from http://web.mit.edu/6.033

For the sake of simplicity, let’s assume the resolution request first goes to the root name server, which doesn’t necessarily have the IP address for web.mit.edu. It responds with the IP address of the .edu name server, which may be responsible for web.mit.edu. The .edu name server in turn responds with the IP address of the authoritative name server for mit.edu, which finally responds with the IP address of web.mit.edu.
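To make the walk-through concrete, here is a small simulation of the same chain of referrals. Everything in it is assumed for illustration: the `SERVERS` table, the server labels such as "root-addr", and the address 192.0.2.80 (taken from a range reserved for documentation) are not real DNS data.

```python
# Hypothetical servers keyed by made-up labels. Each has name records
# ("bindings") and referral records mapping a domain suffix to another server.
SERVERS = {
    "root-addr": {"bindings": {}, "referrals": {"edu": "edu-addr"}},
    "edu-addr": {"bindings": {}, "referrals": {"mit.edu": "mit-addr"}},
    "mit-addr": {"bindings": {"web.mit.edu": "192.0.2.80"}, "referrals": {}},
}


def query(server_addr, hostname):
    """Ask one server; returns ("answer", ip), ("referral", addr) or ("not_found", None)."""
    server = SERVERS[server_addr]
    if hostname in server["bindings"]:
        return "answer", server["bindings"][hostname]
    labels = hostname.split(".")
    # Try the longest matching suffix first, so the nearest server wins.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in server["referrals"]:
            return "referral", server["referrals"][suffix]
    return "not_found", None


def resolve(hostname, start="root-addr", max_hops=10):
    """Follow referrals from `start` until a server answers; returns (ip, path)."""
    path = [start]
    server = start
    for _ in range(max_hops):
        kind, value = query(server, hostname)
        if kind == "answer":
            return value, path
        if kind == "not_found":
            return None, path
        server = value          # referral: ask the next server
        path.append(server)
    raise RuntimeError("too many referrals")


ip, path = resolve("web.mit.edu")
print(ip, path)
```

The returned `path` lists the servers visited in order: the root, then the .edu server, then the mit.edu server that holds the binding.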

Modifications to improve DNS performance

Of course, if all resolutions had to go through the root, it would become a bottleneck for the system. The DNS provides a few features to improve performance:

  • Clients don’t have to send requests to the root name server: requests can be sent to any convenient name server.
  • Recursive resolution: name servers recursively resolve names instead of responding with referral records.
  • Caching: name servers store the addresses gathered so far in local tables to reduce the amount of work. Cache expirations keep the tables from growing too large and limit the number of stale entries.
  • Replication: multiple name servers can be authoritative for the same part of the name space, and clients can learn about new name servers.
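The caching idea can be sketched as a thin wrapper around any slower lookup. This is a minimal illustration under stated assumptions: `CachingResolver`, its parameters, and the single fixed lifetime are hypothetical, whereas in the real DNS each record carries its own time-to-live.

```python
import time


class CachingResolver:
    """Wraps a slower lookup function with a cache whose entries expire."""

    def __init__(self, lookup, ttl_seconds=300.0, clock=time.monotonic):
        self._lookup = lookup      # hypothetical function: hostname -> IP address
        self._ttl = ttl_seconds    # assumed fixed lifetime for every entry
        self._clock = clock        # injectable so that expiry can be tested
        self._cache = {}           # hostname -> (ip, expires_at)

    def resolve(self, hostname):
        now = self._clock()
        entry = self._cache.get(hostname)
        if entry is not None:
            ip, expires_at = entry
            if now < expires_at:
                return ip                  # fresh hit: no upstream query needed
            del self._cache[hostname]      # expired: drop the stale binding
        ip = self._lookup(hostname)        # miss: do the full resolution
        self._cache[hostname] = (ip, now + self._ttl)
        return ip
```

A shorter lifetime means changed bindings are picked up sooner, at the cost of more upstream queries; a longer one reduces load but keeps stale answers around for longer.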

Some things for you to think about

  • Do you ever think about the resolution of the hostnames you type in your browser? Do you trust the resolution given by a name server? What if I could intercept your queries to DNS name servers and send you a wrong response?
  • Would it be possible to authenticate queries like these?
  • Do you think the DNS is a successful system?

The degree to which the DNS has scaled is pretty remarkable considering the system was designed in the 1980s. It also works pretty well in the face of an unreliable network. Its biggest drawback, however, is security: it opens up a can of threat worms, including denial-of-service attacks, but that is a story for another day :)
