Reduce DNS Resolution Time for 10,000 Pods on EKS

Quân Huỳnh
2 min read · Mar 5, 2024

Kubernetes Tips

Problem

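When a Pod resolves a domain that is not a Fully Qualified Domain Name (FQDN), the resolver in the Pod appends each domain of the search path in turn, and CoreDNS has to answer every one of those queries until a match is found. Under the resolver rules (see /etc/resolv.conf in the Pod), a domain is tried as-is first only when it contains at least ndots dots (.), and it is treated as an FQDN, skipping the search path entirely, when it has a . at the end.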

By default, the ndots value in EKS is 5. For example, amazon.com contains only one dot, so a lookup for it from a Pod in the default namespace walks the search path from top to bottom before the name itself is tried:

amazon.com.default.svc.cluster.local

amazon.com.svc.cluster.local

amazon.com.cluster.local

amazon.com.
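The expansion above can be sketched in a few lines of Python. This is a simplified model of the resolver's search-list behaviour, not CoreDNS or libc code; the function name `candidate_names` and the search list (a Pod in the default namespace) are assumptions made for illustration.

```python
def candidate_names(name, search, ndots=5):
    """Return the names a resolver would try, in order (simplified model)."""
    # A trailing dot marks the name as fully qualified: no search path.
    if name.endswith("."):
        return [name]

    absolute = name + "."
    expanded = [f"{name}.{domain}." for domain in search]

    # With at least `ndots` dots the name is tried as-is first;
    # otherwise the search path is walked before the name itself.
    if name.count(".") >= ndots:
        return [absolute] + expanded
    return expanded + [absolute]


# Assumed search list for a Pod in the "default" namespace.
SEARCH = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]

for candidate in candidate_names("amazon.com", SEARCH):
    print(candidate)
```

A real resolver stops at the first name that returns an answer, so an external domain such as amazon.com pays for every failed attempt before the final one succeeds.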

If the number of Pods is small, this causes no issue, but as the system expands, the extra lookups add up to a large number of DNS queries, and CoreDNS becomes slow or starts returning errors. For example, when the number of Pods reaches 10,000, peak resolution time may rise to 3000 ms.
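To get a feel for the scale, here is a rough back-of-the-envelope calculation. The lookup rate per Pod is a hypothetical number, and the factor of 2 assumes the client asks for both A and AAAA records for every name it tries; neither figure comes from a measurement.

```python
def queries_per_lookup(search_domains, record_types=2):
    """Worst-case queries for one external name that misses every search domain."""
    names_tried = search_domains + 1  # every search domain, then the name itself
    return names_tried * record_types


# Hypothetical workload numbers, chosen only to show the scale.
PODS = 10_000
LOOKUPS_PER_POD_PER_SECOND = 1

with_search_path = queries_per_lookup(search_domains=3)
name_only = queries_per_lookup(search_domains=0)

print(with_search_path, "queries per lookup when the search path is walked")
print(name_only, "queries per lookup when only the name itself is queried")
print(PODS * LOOKUPS_PER_POD_PER_SECOND * with_search_path, "queries/s cluster-wide")
```

Under these assumptions, each lookup costs 8 queries instead of 2, which is where the load on CoreDNS comes from.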

Solution

One simple solution to this issue is to add a . at the end of the domain. The domain is then in FQDN format, so the search path is skipped and only the name itself is queried. For example, instead of using amazon.com, we should use "amazon.com." (with the trailing dot). With this change, the DNS query time in our Pods can decrease by up to 70%.
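A minimal sketch of applying this from application code, using only the Python standard library. The helper names `to_fqdn` and `time_lookup` are made up for this example, and the timings you see will depend on your cluster, resolver, and caching.

```python
import socket
import time


def to_fqdn(host):
    """Append a trailing dot so the resolver skips the search path."""
    return host if host.endswith(".") else host + "."


def time_lookup(host, port=443):
    """Return the wall-clock time of one lookup, in milliseconds."""
    start = time.perf_counter()
    socket.getaddrinfo(host, port)
    return (time.perf_counter() - start) * 1000


# Example, to be run from inside a Pod:
#   print(time_lookup("amazon.com"))
#   print(time_lookup(to_fqdn("amazon.com")))
```

Comparing the two timings from inside a Pod is a quick way to check how much the search path costs in your own cluster.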

Another solution is to use NodeLocal DNSCache, which runs a DNS cache on each node so that repeated lookups from Pods are answered locally instead of all going to CoreDNS.
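The idea behind a node-local cache can be illustrated with a toy example. This is not how NodeLocal DNSCache is implemented or configured; `TtlCache` and the stub upstream resolver are hypothetical and only show why caching reduces the number of queries that reach CoreDNS.

```python
import time


class TtlCache:
    """Tiny TTL cache standing in for a node-local DNS cache (illustration only)."""

    def __init__(self, resolve, ttl_seconds=30.0, clock=time.monotonic):
        self._resolve = resolve      # function that queries the upstream resolver
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}           # name -> (expiry time, answer)
        self.upstream_queries = 0

    def lookup(self, name):
        now = self._clock()
        cached = self._entries.get(name)
        if cached and cached[0] > now:
            return cached[1]         # served locally, upstream is not contacted
        self.upstream_queries += 1
        answer = self._resolve(name)
        self._entries[name] = (now + self._ttl, answer)
        return answer


# Hypothetical upstream resolver returning a documentation-range address.
cache = TtlCache(resolve=lambda name: "192.0.2.10")
for _ in range(1000):
    cache.lookup("amazon.com.")
print(cache.upstream_queries)  # prints 1
```

In this toy run, 1000 lookups from the same node result in a single upstream query until the entry expires.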


Quân Huỳnh

I’m a technical blogger and love writing. I write about Kubernetes, AWS Cloud, and Terraform.