How does DNS name resolution work, in principle?

dns

I'm currently taking an online Linux sysadmin course and was asked a question that I don't really understand. I know how to look up a name server (if I'm correct, by using the dig command and reading the addresses in the ADDITIONAL section of its output), but I got a bit lost when asked the following question.

Assuming your configured nameserver does not have any cached results at its disposal, how many nameservers must your nameserver query in order to resolve maps.google.com? What command(s) would you use to find all these nameservers? List one from each level and explain why this level is needed.

I don't want the answer, I would just like to know what I am being asked to do exactly.

Best Answer

Assuming your configured nameserver does not have any cached results at its disposal, how many nameservers must your nameserver query in order to resolve maps.google.com? What command(s) would you use to find all these nameservers? List one from each level and explain why this level is needed.

Well, let's pick this one apart.

"Assuming your configured nameserver does not have any cached results at its disposal" -- first off, if it has no cached data at all, then it cannot resolve anything. In order to prime the resolver's cache, you need to have the NS and address (A, AAAA) records for the . (A.K.A. root) zone. That's the root name servers, which are found in the root-servers.net. zone. There's nothing magical about that zone or those DNS servers. However, this data is often provided "out of band" to the DNS resolver, precisely to prime the resolver's cache. Authoritative-only name servers don't need this data, but resolving name servers do.

Also, "resolve" to what? Any RRtype at that name? An A RR? Or something else? What class (CH/Chaosnet, IN/Internet, ...)? The exact process will be different, but the general idea remains the same.

If we can assume that we know how to find the root name servers but nothing more, and that by "resolve" we mean getting the contents of any IN A RRs associated with the name, it gets a lot more practical.
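To make "we know how to find the root name servers" a bit more concrete, here is a minimal sketch of priming a resolver from root hints. The three records below are a simplified excerpt in zone-file style, using only the a.root-servers.net data mentioned in this answer; the exact layout and the parse_hints helper are assumptions for illustration, not what any particular resolver does.

```python
# Simplified root hints excerpt (assumed layout: owner, TTL, type, data).
ROOT_HINTS = """\
.                    3600000  NS    A.ROOT-SERVERS.NET.
A.ROOT-SERVERS.NET.  3600000  A     198.41.0.4
A.ROOT-SERVERS.NET.  3600000  AAAA  2001:503:ba3e::2:30
"""


def parse_hints(text: str) -> dict[str, list[str]]:
    """Map each root server name to the addresses listed for it."""
    servers: dict[str, list[str]] = {}
    for line in text.splitlines():
        # Drop zone-file comments (everything after ';') and blank lines.
        line = line.split(";", 1)[0].strip()
        if not line:
            continue
        owner, _ttl, rrtype, rdata = line.split()
        if rrtype == "NS":
            # The NS record names a server for the root zone.
            servers.setdefault(rdata.lower(), [])
        elif rrtype in ("A", "AAAA"):
            # Address records tell us where to reach that server.
            servers.setdefault(owner.lower(), []).append(rdata)
    return servers


print(parse_hints(ROOT_HINTS))
```

With this in hand, the resolver has at least one address to send its first query to.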

To resolve a DNS name, you basically split the name into labels and then work your way from right to left. Don't forget the . at the end; you'd really be resolving maps.google.com. (with the trailing dot) rather than maps.google.com (without it). That leaves us with the following names to resolve (we know this list up front, but a DNS resolver implementation probably won't):

  • .
  • com.
  • google.com.
  • maps.google.com.
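The list above can be derived mechanically. The following is a small sketch; zone_chain is a hypothetical helper name, and it only does the string handling, not any actual lookups.

```python
def zone_chain(name: str) -> list[str]:
    """Return the names to walk, from the root down to the full name."""
    # Drop the trailing dot (if any) so both spellings give the same labels.
    labels = [label for label in name.rstrip(".").split(".") if label]
    chain = ["."]  # the root zone is always the starting point
    # Add one label at a time, working from right to left.
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain


print(zone_chain("maps.google.com"))
# ['.', 'com.', 'google.com.', 'maps.google.com.']
```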

Start by figuring out where to ask for the content of . (the root zone). That's easy; we already have that information: the root name server names and IP addresses. So we have a root name server. Let's say we decide to use 198.41.0.4 (a.root-servers.net, also reachable at 2001:503:ba3e::2:30) to continue the name resolution. In practice, one of the first things the resolver does will likely be to use the provided root server data to ask one of the root servers for an accurate list of the name servers for the root zone. That way, as long as any one of the provided names and IP addresses is valid and reachable, the resolver has a full and complete set of name server data for the root zone when resolution begins.

Shoot off a DNS query for maps.google.com. IN A to 198.41.0.4. It'll tell you in response "nope, not gonna do it, but here's someone who might know"; that's a referral. It contains NS records for the closest zone that the server in question knows about, along with any glue records (the A/AAAA addresses of those name servers) the server happens to have available. If no glue data is available, you first have to resolve the host named in the NS record you picked, so spawn a separate name resolution to get its IP address; if glue data is available, you already have the IP address of a name server that is at least "closer" to the answer. In this case, the referral will be to the set of servers for the com. zone, and glue data is provided as well.
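For the curious, this is roughly what "shooting off a query" looks like on the wire. It is a minimal sketch using only the Python standard library; build_query and send_query are hypothetical helper names, the query ID is arbitrary, and the reply is returned as raw bytes without any parsing. The recursion-desired flag is left cleared, since here we want the server to answer only from what it knows itself.

```python
import socket
import struct


def build_query(name: str, qtype: int = 1, qclass: int = 1,
                query_id: int = 0x1234) -> bytes:
    """Encode a DNS query (header plus one question) in wire format.

    qtype 1 is A and qclass 1 is IN.
    """
    # Header fields: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
    # Flags are 0: a standard query with RD (recursion desired) cleared.
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, 0)
    qname = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            qname += bytes([len(raw)]) + raw  # length byte, then the label
    qname += b"\x00"  # the zero-length label is the root
    return header + qname + struct.pack("!HH", qtype, qclass)


def send_query(server: str, packet: bytes, timeout: float = 3.0) -> bytes:
    """Send the packet over UDP port 53 and return the raw reply bytes.

    Needs network access; not called automatically in this sketch.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(packet, (server, 53))
        reply, _addr = sock.recvfrom(4096)
        return reply


packet = build_query("maps.google.com.")
print(len(packet), packet.hex())
```

Calling send_query("198.41.0.4", packet) would send the question to the root server picked above; decoding the referral in the reply is left out to keep the sketch short.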

Repeat the process, asking one of the com. name servers the same question. They don't know either, but will refer you to Google's authoritative name servers. At this point, in the general case, it's hit or miss whether glue data is provided; there's nothing preventing a com domain from having name servers only under nl, for example, in which case glue data is unlikely to be available from the gTLD servers. The provided glue data might also be incomplete or, if you're really unlucky, even incorrect! You always have to be prepared to spawn off that separate name resolution I mentioned above.

Basically, you keep going until you get an answer with the aa (authoritative answer) flag set. That answer will tell you what you're asking for, or that the RR you asked for doesn't exist (either NXDOMAIN, or NOERROR with zero response data records). Keep looking out for responses like SERVFAIL (and back off one step and try another server if you get one; if all named servers return SERVFAIL, fail the name resolution process and return SERVFAIL yourself to the client).
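The loop described above can be sketched against a fake, in-memory "network" so it runs without sending any packets. Everything here is an assumption for illustration: the canned replies, the reply dictionary layout, and every address except 198.41.0.4 (the others are documentation-only addresses, not real name servers). For brevity, a SERVFAIL just moves on to the next server at the same level, and glue is assumed to be present in every referral.

```python
# Hypothetical canned replies to the query "maps.google.com. IN A".
FAKE_NETWORK = {
    # Root server: refers us to two (made-up) com. servers.
    "198.41.0.4": {"rcode": "NOERROR", "aa": False, "answer": [],
                   "referral": ["192.0.2.1", "192.0.2.2"]},
    # A broken com. server.
    "192.0.2.1": {"rcode": "SERVFAIL", "aa": False, "answer": [],
                  "referral": []},
    # A working com. server: refers us to a (made-up) google.com. server.
    "192.0.2.2": {"rcode": "NOERROR", "aa": False, "answer": [],
                  "referral": ["198.51.100.1"]},
    # Authoritative server: answers with the aa flag set.
    "198.51.100.1": {"rcode": "NOERROR", "aa": True,
                     "answer": ["203.0.113.10"], "referral": []},
}


def resolve(network, root_servers, max_steps=10):
    """Follow referrals until a reply has aa set; return (rcode, answer, queried)."""
    servers = list(root_servers)
    queried = []
    for _ in range(max_steps):
        referral = None
        for server in servers:
            queried.append(server)
            reply = network[server]
            if reply["rcode"] == "SERVFAIL":
                continue  # try another server for this zone
            if reply["aa"]:
                # Authoritative: data, NXDOMAIN, or NOERROR with no records.
                return reply["rcode"], reply["answer"], queried
            referral = reply["referral"]
            break
        if not referral:
            # Every server failed (or gave us nowhere to go): give up.
            return "SERVFAIL", [], queried
        servers = referral
    return "SERVFAIL", [], queried


rcode, answer, queried = resolve(FAKE_NETWORK, ["198.41.0.4"])
print(rcode, answer)
print(queried)
```

The queried list shows one server contacted per level, plus the failed com. server that had to be skipped.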

The alternative to asking for the full RRname from each server (which might be considered bad practice) is to use the split-up list of labels that we determined earlier, ask the name servers given by the server further toward the root for IN NS and IN A/IN AAAA RRs for that label, and use those to further the name resolution process. That's only marginally different in practice, and the same process still applies.

You can simulate this entire process by using the +trace option of the dig utility, which comes as part of BIND, or by using set debug in nslookup.
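If you'd rather drive dig from a script, a thin wrapper might look like the sketch below. trace_command and run_trace are hypothetical helper names; run_trace assumes dig is installed and that the machine has network access, and it is not called automatically here.

```python
import shutil
import subprocess


def trace_command(name: str, rrtype: str = "A") -> list[str]:
    """Build the argument list for a dig run that follows referrals from the root."""
    return ["dig", "+trace", name, rrtype]


def run_trace(name: str, rrtype: str = "A") -> str:
    """Run dig +trace and return its text output."""
    if shutil.which("dig") is None:
        raise RuntimeError("dig is not installed")
    result = subprocess.run(trace_command(name, rrtype),
                            capture_output=True, text=True, timeout=60)
    return result.stdout


print(" ".join(trace_command("maps.google.com.")))
# dig +trace maps.google.com. A
```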

It's also worth remembering that some RRtypes (notably NS, MX and a few others; also, A6 was reasonably well-used for a while but has been deprecated) can and do reference other RRs. In that case, you may need to spawn off yet another name resolution process to give a complete and useful reply to your client.
