An Analysis of the Tor DNS Landscape.
Unlike other relays, tor exit relays also take care of name resolution for tor clients. Their DNS configuration determines where the tor network’s DNS traffic is sent.
Ever since the tor-dns paper was published I have wanted to take a look at the current state of the DNS resolver distribution on the tor network. In their work from 2016 the authors noted that Google controls a significant share of tor’s DNS traffic.
Here is a plot of their data from 2015-2016 showing Google as the DNS resolver controlling the biggest chunk of the tor network’s DNS traffic:
Now the question is, how is the tor network doing two years after Philipp Winter et al. urged the tor relay operators to stop using Google’s DNS resolver?
With new players like Quad9 and Cloudflare on the “DNS resolver market” asking for your DNS traffic, who are the big DNS players on today's tor network?
The methodology and tools used to collect the DNS resolver data of exit relays are described in their paper. We made only one small modification to the methodology: we consider all resolver IP addresses we observed (not just the first/primary IP address). So if an exit relay uses its ISP’s resolver and, for example, Google, we take both into account.
We collected DNS resolver data of about 89.2% of the tor network’s exit capacity (April 2018).
Analyzing Tor Exit DNS Resolver Data
In the first step we want to determine how much of the tor network by exit capacity is using at least one of the prominent public DNS resolvers that we expected to see in the data set:
All but Quad9 publish the source IP addresses that their resolvers use for outbound queries, so we can use that public data to attribute source IP addresses to these companies without false positives or false negatives. Note: If you send a DNS query to, for example, Google (22.214.171.124), Google will not use 126.96.36.199 as the source IP address for outbound queries to authoritative name servers. For that reason we need their IP address ranges for proper attribution. In the case of Quad9 there is no public information about the IP addresses they use for outbound traffic, but their support was kind enough to provide us with their huge Excel file ;)
This first step gives us the following distribution (note that the actual numbers are likely higher than the ones shown here since we do not have data for 100% of tor exit capacity):
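The attribution step can be sketched roughly as follows. This is only an illustration of the idea, not the tooling we used: the provider names and CIDR ranges are placeholders taken from the RFC 5737 documentation blocks, and `build_index` and `attribute` are hypothetical helper names.

```python
import ipaddress

# Placeholder ranges (RFC 5737 documentation blocks). These are NOT the
# real egress ranges of any provider; substitute the published lists.
EGRESS_RANGES = {
    "provider-a": ["192.0.2.0/24"],
    "provider-b": ["198.51.100.0/25", "203.0.113.0/24"],
}


def build_index(ranges):
    """Parse the CIDR strings once so lookups do not re-parse them."""
    return {
        name: [ipaddress.ip_network(cidr) for cidr in cidrs]
        for name, cidrs in ranges.items()
    }


def attribute(source_ip, index):
    """Return the provider whose egress range contains source_ip, else None."""
    addr = ipaddress.ip_address(source_ip)
    for name, networks in index.items():
        if any(addr in net for net in networks):
            return name
    return None


INDEX = build_index(EGRESS_RANGES)
print(attribute("198.51.100.17", INDEX))
print(attribute("198.51.100.200", INDEX))
```

The lookup is keyed on the source address seen at the authoritative name server, not on the address the exit sends its queries to, which is why the egress ranges are needed.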
In the next step we sort the remaining resolver IP addresses — that were not attributed to any of these four known resolvers — into two categories:
- same-AS: resolver IP address is located in the same autonomous system as the exit relay’s primary IPv4 address. This is usually the case if the exit uses its provider’s DNS resolver or runs its own local resolver (recommended, short path to the resolver)
- remote-AS: resolver IP address is located outside the exit’s autonomous system (less desirable, longer path to the resolver)
Note: An exit relay can be in both categories at the same time since it can use multiple resolvers and their resolvers in turn can be forwarders sending their queries to multiple upstream resolvers before the query reaches the authoritative nameservers.
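A minimal sketch of this categorization, assuming the AS numbers of the exit and of its unattributed resolver IP addresses have already been looked up (the lookup itself is out of scope here). `classify_exit` is a hypothetical name and the AS numbers in the usage line come from the range reserved for documentation.

```python
def classify_exit(exit_asn, resolver_asns):
    """Return the set of categories an exit falls into.

    exit_asn: AS number of the exit relay's primary IPv4 address.
    resolver_asns: AS numbers of the resolver IPs observed for that exit
                   that were not attributed to a known public resolver.
    """
    categories = set()
    for asn in resolver_asns:
        categories.add("same-AS" if asn == exit_asn else "remote-AS")
    return categories


# An exit that uses one resolver inside and one outside its own AS
# ends up in both categories.
print(classify_exit(64500, [64500, 64501]))
```

Returning a set rather than a single label reflects the note above: one exit can be counted in both categories.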
Within the remote-AS category we aim to identify unknown but frequently used resolvers by grouping entries by the autonomous system number of the resolver IP address. The following two autonomous systems come up at the top of the list (biggest exit capacity share):
- AS1280 (ISC)
- AS3356 (Level 3)
A closer look at AS1280 shows that this is always the same resolver IP address (188.8.131.52). Looking at the exits using this resolver, we notice that they all belong to a single exit operator (the biggest on the network), who apparently used ISC as their primary resolver.
So we added these two ASes to the picture for a more complete overview:
In reality ISC’s share was likely around 10% because we only had data for about half of that operator’s exit relays and that operator runs about 10% of the tor network exit capacity.
The remaining (~50) ASes in the remote-AS category are not too relevant, as each of them gets less than 1% of tor’s DNS traffic.
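The per-AS grouping of the remote-AS category can be sketched like this. It assumes each exit comes with its fraction of total exit capacity and the AS numbers of its remote resolvers; the function name, the input shape and the sample values are made up for illustration.

```python
from collections import defaultdict


def remote_as_shares(exits):
    """Sum exit capacity per resolver AS, biggest share first.

    exits: iterable of (capacity_fraction, remote_resolver_asns) tuples.
    """
    shares = defaultdict(float)
    for capacity, asns in exits:
        for asn in set(asns):  # count each AS only once per exit
            shares[asn] += capacity
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


# Made-up sample data using documentation AS numbers.
sample = [
    (0.25, [64500]),
    (0.125, [64500, 64501]),
    (0.5, [64502, 64502]),
]
for asn, share in remote_as_shares(sample):
    print(f"AS{asn}: {share:.1%}")
```

Because an exit with several remote resolvers is counted once for each AS, the shares can add up to more than the total remote-AS share.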
What does that data tell us?
Google still controls a significant fraction of tor’s DNS traffic, so things did not improve much since 2016. ISC had a significant share (estimated at ~10%) due to a single big exit operator using it. A DNS resolver within Level 3’s AS appears to be used by multiple exit operators (update 2018–05–11: this is the resolver at 184.108.40.206). New resolvers like Quad9 and Cloudflare are already among the biggest, and now is probably a good time to raise awareness before they gain significantly more traction.
Solving the Problem
1. raise awareness among relay operators
We try to reach out to exit operators using any of the big remote-AS resolvers on their exits. If we manage to convince the 20 biggest Google-DNS-resolver-using exit operators to switch to a local resolver, we can reduce Google’s share by over 12%. This sounds practically doable (minus those without working ContactInfo).
The recommendation to not use Google et al. on exit relays isn’t exactly new. We will need to be persistent and provide specific information to help reduce DNS centralization on the Tor network.
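The kind of estimate behind the "20 biggest operators" figure can be sketched as below. Grouping exits by their ContactInfo string is only an approximation of "operator" and is our assumption here; `reachable_reduction` and the sample data are hypothetical.

```python
from collections import defaultdict


def reachable_reduction(exits, n):
    """Combined capacity share of the n biggest reachable operators.

    exits: iterable of (contact_info, capacity_fraction) for the exits
           that use the resolver in question.
    """
    per_operator = defaultdict(float)
    for contact, capacity in exits:
        if contact:  # operators without ContactInfo cannot be contacted
            per_operator[contact] += capacity
    biggest = sorted(per_operator.values(), reverse=True)[:n]
    return sum(biggest)


# Made-up sample: three reachable operators and one without ContactInfo.
sample = [
    ("a@example.org", 0.25),
    ("a@example.org", 0.25),
    ("b@example.org", 0.125),
    ("", 0.5),
    ("c@example.org", 0.0625),
]
print(reachable_reduction(sample, 2))
```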
2. provide exit operators with actionable information
It might not always be obvious to exit operators which DNS resolvers actually get their DNS queries if their upstream DNS resolver is forwarding their queries to third parties like Google, OpenDNS, Quad9 or Cloudflare.
Exit operators can and should have a look at the following list to easily tell if they use(d) a resolver outside their own AS (beware of false negatives due to missing measurements).
Note: Under rare circumstances (exit relay is multi-homed in two ASes) it might not be an issue if the relay IP address is not in the same AS as the resolver IP address, but we don’t expect that to be often the case.
3. provide practical solutions for exit operators
We added simple and straightforward step-by-step instructions on how to configure a local DNSSEC-validating and caching resolver on an exit relay to the Tor Relay guide.
In ansible-relayor we blacklisted the biggest known DNS resolvers to encourage operators to diversify their resolver selection (we abort if we detect a known big resolver in /etc/resolv.conf).
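The idea behind such a check can be illustrated with a few lines of Python. This is not the ansible-relayor implementation, just a sketch of the same idea; the address list contains the well-known anycast addresses of the four resolvers discussed here and is not exhaustive, and `find_big_resolvers` is a hypothetical name.

```python
# Well-known service addresses of large public resolvers (not exhaustive).
KNOWN_BIG_RESOLVERS = {
    "8.8.8.8": "Google",
    "8.8.4.4": "Google",
    "1.1.1.1": "Cloudflare",
    "1.0.0.1": "Cloudflare",
    "9.9.9.9": "Quad9",
    "208.67.222.222": "OpenDNS",
    "208.67.220.220": "OpenDNS",
}


def find_big_resolvers(resolv_conf_text):
    """Return (address, provider) pairs for known big resolvers that are
    configured as nameservers in the given resolv.conf contents."""
    hits = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Commented-out lines do not start with the bare keyword.
        if len(fields) >= 2 and fields[0] == "nameserver":
            provider = KNOWN_BIG_RESOLVERS.get(fields[1])
            if provider:
                hits.append((fields[1], provider))
    return hits


sample = "nameserver 127.0.0.1\nnameserver 8.8.8.8\n#nameserver 1.1.1.1\n"
print(find_big_resolvers(sample))
```

In practice the input would be the contents of /etc/resolv.conf. Note that a check like this only sees the first hop: a local or provider resolver that forwards to one of these services stays invisible to it.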
4. tor could write a warning to its log file when it detects common DNS resolvers and it is configured to be an exit relay
5. Add DNS related information to Relay Search (a long term item)
It would be nice and probably effective to have information about DNS resolvers show up on Relay Search, because it is a popular tool for relay operators to check on their relay state. Operators could easily see if they use any less desirable DNS resolvers if that information is shown on Relay Search. That way we could even reach operators who have no or invalid ContactInfo data, but multiple steps are required before this could happen:
The currently unavailable data needs to be
The goal is to be below the following thresholds within one year:
- not have any single remote-AS entity control more than 10% of exit capacity
- reduce the overall remote-AS share to below 20% of exit capacity
We will revisit after 2019–05–01.
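Checking a future measurement against these two thresholds is simple arithmetic; a sketch under the assumption that the per-entity shares and the overall remote-AS share are available as fractions of exit capacity (names and sample values are illustrative only):

```python
MAX_SINGLE_ENTITY = 0.10  # no single remote-AS entity above 10%
MAX_REMOTE_TOTAL = 0.20   # overall remote-AS share below 20%


def goals_met(entity_shares, remote_total):
    """Return (ok, offenders).

    entity_shares: mapping of remote-AS entity name to its fraction of
                   exit capacity.
    remote_total:  fraction of exit capacity using any remote-AS resolver.
    """
    offenders = {
        name: share
        for name, share in entity_shares.items()
        if share > MAX_SINGLE_ENTITY
    }
    ok = not offenders and remote_total < MAX_REMOTE_TOTAL
    return ok, offenders


print(goals_met({"resolver-a": 0.25, "resolver-b": 0.05}, 0.3))
```

The overall share is passed in separately because an exit can use several remote resolvers at once, so the per-entity shares cannot simply be added up.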
Ideas for future work
- continuous monitoring and alerting. We would like to measure the Tor DNS landscape on an ongoing basis but we depend on a torsocks feature to actually do it. The goal would be to continuously monitor the state of DNS of the tor network and graph the data to better understand trends and how the landscape is changing. With continuous measurements we could set up alerts that send out emails as soon as a given DNS resolver’s share rises above, say, 1%. A future revival of Tor Weather could also include such a check.
- measure additional properties. We would also like to measure additional DNS related metrics to answer questions like: How many exits have QNAME minimization enabled?
- increase measurement coverage
This time we had data on about 89% of the tor exit capacity; we should have more than 99% coverage in the future.
This blog post actually ends here but there is some bonus material below.
What other information can we learn from this data?
Undeclared Relay Group Detection
The data also appears to be useful to detect undeclared relay families because their (potentially unique) set of DNS resolvers might link multiple exits (now that detection is spoiled ;)
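One way such a detection could work is to group exits by the exact set of resolver IP addresses observed for them. The sketch below assumes a mapping from relay fingerprint to resolver IPs; the function name and the sample data are hypothetical, and a shared resolver set is only a hint, since a common provider resolver will also link unrelated exits.

```python
from collections import defaultdict


def candidate_groups(exit_resolvers, min_size=2):
    """Group exits that share the exact same set of resolver IPs.

    exit_resolvers: mapping of relay fingerprint to an iterable of
                    resolver IP addresses observed for that relay.
    """
    by_set = defaultdict(list)
    for fingerprint, resolvers in exit_resolvers.items():
        key = frozenset(resolvers)
        if key:  # skip exits without any measurement
            by_set[key].append(fingerprint)
    return [sorted(fps) for fps in by_set.values() if len(fps) >= min_size]


sample = {
    "A": ["192.0.2.1", "192.0.2.2"],
    "B": ["192.0.2.2", "192.0.2.1"],
    "C": ["198.51.100.1"],
    "D": [],
}
print(candidate_groups(sample))
```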
Are there any DNS resolvers with a higher failure rate than others?
In the past (unrelated to these measurements), while we were contacting exit operators about their exits failing a significant portion of DNS resolution attempts, we noticed that on more than one occasion the exit operator used Google’s DNS resolver. So we were wondering: Do exit relays that use Google’s DNS resolver have a higher failure rate than exits not using Google’s DNS resolver? I’d say I can see some correlation but I’d like to have more data before drawing any conclusions.
The same-AS Category
Out of curiosity we also took a short look at the same-AS category by sub-categorizing these into:
- sameIP
Exits where all DNS queries came from the exit’s primary IPv4 address only. This category has known false negatives due to missing IPv6 exit IP address data.
- same /24 netblock
Exits where all resolver IPs are located in the same /24 network block, excluding the exit’s own IP (sameIP).
- Exits where all resolver IPs are within the exit’s autonomous system (excluding same /24 netblock and sameIP exits).
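These sub-categories can be expressed in a few lines. The sketch assumes IPv4 addresses and that every resolver IP passed in is already known to be in the exit’s AS; the label for the third sub-category is our own placeholder, and treating an exit that uses both its own IP and a neighbour in the /24 as "same /24 netblock" is a choice made for this example.

```python
import ipaddress


def same_as_subcategory(exit_ip, resolver_ips):
    """Sub-categorize a same-AS exit by how close its resolvers are."""
    if not resolver_ips:
        raise ValueError("no resolver IPs observed for this exit")
    exit_addr = ipaddress.ip_address(exit_ip)
    resolvers = [ipaddress.ip_address(ip) for ip in resolver_ips]
    if all(r == exit_addr for r in resolvers):
        return "sameIP"
    # strict=False lets us derive the enclosing /24 from a host address.
    net24 = ipaddress.ip_network(f"{exit_ip}/24", strict=False)
    if all(r in net24 for r in resolvers):
        return "same /24 netblock"
    return "same-AS (other)"  # placeholder label for the remaining exits


print(same_as_subcategory("192.0.2.10", ["192.0.2.53"]))
```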
And in case you wonder what the distribution within the same-AS category looks like:
This includes the usual big ASes:
What about DNS-over-TLS to protect DNS traffic from prying eyes?
If there were >200 organizations providing such resolvers I’d consider it, but since there are only very few providers offering it, recommending DNS-over-TLS to the entire tor community would result in centralization towards a handful of operators, which is the very effect we are trying to avoid.