macOS – Troubleshooting DNS

I'm having a weird problem where the system-wide DNS resolution doesn't work, but I don't know how I could fix that, or even find a log (coming from Linux).
I manually configured 8.8.8.8 and 8.8.4.4 as DNS servers in the GUI, which seems to have taken effect:

$ scutil --dns
DNS configuration

resolver #1
  search domain[0] : Home
  nameserver[0] : 8.8.8.8
  nameserver[1] : 8.8.4.4
  flags    : Request A records
  reach    : Reachable

DNS configuration (for scoped queries)

resolver #1
  search domain[0] : Home
  nameserver[0] : 8.8.8.8
  nameserver[1] : 8.8.4.4
  if_index : 4 (en0)
  flags    : Scoped, Request A records
  reach    : Reachable

However, when the system tries to resolve a name, it fails with a timeout. Only software that doesn't use the system resolver (e.g. Chrome) is unaffected:

$ ping google.com
ping: cannot resolve google.com: Unknown host

$ scutil -r google.com
Not Reachable

The DNS servers themselves can be queried manually:

$ nslookup google.com
Server:     8.8.8.8
Address:    8.8.8.8#53

Non-authoritative answer:
Name:   google.com
Address: 2.127.237.183
...

$ dig google.com
google.com.     50  IN  A   2.127.237.183
;; Query time: 226 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)

And the results are valid:

$ ping 2.127.237.183
64 bytes from 2.127.237.183: icmp_seq=0 ttl=60 time=37.086 ms

$ scutil -r 2.127.237.183
Reachable

My hosts file doesn't contain anything surprising:

$ cat /etc/hosts
##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting.  Do not change this entry.
##
127.0.0.1   localhost
255.255.255.255 broadcasthost
::1             localhost

Requesting a new DHCP lease didn't help, and neither did clearing the manually configured DNS servers:

$ networksetup -getinfo Wi-Fi
DHCP Configuration
IP address: 192.168.0.2
Subnet mask: 255.255.255.0
Router: 192.168.0.1
Client ID:
IPv6: Automatic
IPv6 IP address: none
IPv6 Router: none

$ networksetup -setdnsservers Wi-Fi Empty

$ scutil --dns
DNS configuration

resolver #1
  search domain[0] : Home
  nameserver[0] : 192.168.0.1
  if_index : 4 (en0)
  flags    : Request A records
  reach    : Reachable,Directly Reachable Address

DNS configuration (for scoped queries)

resolver #1
  search domain[0] : Home
  nameserver[0] : 192.168.0.1
  if_index : 4 (en0)
  flags    : Scoped, Request A records
  reach    : Reachable,Directly Reachable Address

$ scutil -r google.com
Not Reachable

The logs available in Console.app mostly show apps complaining about timeouts. I find this especially odd: resolution doesn't fail immediately, as it would if no server were available. Instead it always times out, as if the resolver tries to reach the servers but can't.

Unlike on Linux, dig and nslookup don't appear to use the system resolver that all the other apps and services use. Is there a tool that does use the system resolver and has some options to tell me what's wrong?

Best Answer

I don't know what might cause a problem like this, but I can give you some troubleshooting pointers.

  • First, try doing a manual query to 8.8.4.4 (dig google.com @8.8.4.4) -- dig, nslookup, and host all seem to use the first listed server, but the system resolver uses a weird round-robin-ish system that'll fail intermittently if some of the configured DNS servers don't work right. Similarly, you might try configuring the OS to use just 8.8.8.8 and see if that changes anything.

  • Speaking of the system resolver, it's possible it's gotten into some weird state, so restarting it may clear the problem. Actually, I'd reset both opendirectoryd (which dispatches all kinds of lookups) and mDNSResponder (which actually does the DNS part), just in case. sudo killall opendirectoryd mDNSResponder should do the trick. Note that both daemons will be restarted automatically.

  • You can get more info out of mDNSResponder by sending it signals. Probably the most useful is the packet logging feature, which makes it log each DNS packet sent & received to /var/log/system.log. You can toggle it on & off with sudo killall -USR2 mDNSResponder. The log entries should look something like this (for a successful lookup, that is):

    -- Sent UDP DNS Query (flags 0100) RCODE: NoErr (0) RD ID: 28215 25 bytes from port 61186 to 172.20.0.1:53 --
     1 Questions
     0 scanme.insecure.net. Addr
     0 Answers
     0 Authorities
     0 Additionals
    --------------
    -- Received UDP DNS Response (flags 8180) RCODE: NoErr (0) RD RA ID: 28215 272 bytes from 172.20.0.1:53 to 172.20.6.67:61186 --
     1 Questions
     0 scanme.insecure.net. Addr
     1 Answers
     0 TTL    3600    4 scanme.insecure.net. Addr 5.45.96.131
     4 Authorities
     0 TTL   86400   17 insecure.net. NS ns3.eurodns.com.
     1 TTL   86400   17 insecure.net. NS ns2.eurodns.com.
     2 TTL   86400   17 insecure.net. NS ns4.eurodns.com.
     3 TTL   86400   17 insecure.net. NS ns1.eurodns.com.
     7 Additionals
     0 TTL    3600    4 ns1.eurodns.com. Addr 80.92.65.2
     1 TTL    3600   16 ns1.eurodns.com. AAAA 2001:0B20:1001:0004:0000:0000:0000:0002
     2 TTL    3600    4 ns2.eurodns.com. Addr 80.92.89.242
     3 TTL    3600   16 ns2.eurodns.com. AAAA 2001:0B20:1001:0011:0000:0000:0000:0242
     4 TTL     600    4 ns3.eurodns.com. Addr 80.92.95.42
     5 TTL    3600    4 ns4.eurodns.com. Addr 192.174.68.100
     6 TTL    3600   16 ns4.eurodns.com. AAAA 2001:067C:01BC:0000:0000:0000:0000:0100
    --------------
    

    You can also send it a USR1 signal to turn on debug logging (which seems to require you to know a lot about mDNSResponder's internals to make sense of), and the INFO signal makes it dump its internal state into the system log (probably informative, but lots of info to sort through).
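
To make the first and third pointers a bit more concrete, here is a rough sketch of some shell helpers. The function names (list_nameservers, query_each_server, unanswered_queries) are made up for illustration. The parsing assumes that scutil --dns output and the packet log lines look like the samples shown in this thread, so treat it as a starting point rather than a finished tool.

```shell
# Hypothetical helpers; names and parsing rules are assumptions based on
# the sample output in this thread.

# Read `scutil --dns` output on stdin and print each configured
# nameserver address once (sorted).
list_nameservers() {
  awk '/nameserver\[[0-9]+\]/ { print $3 }' | sort -u
}

# Read server addresses on stdin and query NAME against each one
# separately, so a single broken server stands out.
query_each_server() {
  name=$1
  while read -r server; do
    printf '== %s ==\n' "$server"
    # Short timeout and a single try keep a dead server from stalling the loop.
    dig +short +time=2 +tries=1 "$name" "@$server" || true
  done
}

# Read packet-log lines on stdin and print the ID of every query that
# was sent but has no matching response, which is what a timeout should
# look like in the log.
unanswered_queries() {
  awk '
    # Return the token that follows "ID:" on the current line, if any.
    function query_id(   i) {
      for (i = 1; i < NF; i++) if ($i == "ID:") return $(i + 1)
      return ""
    }
    /Sent UDP DNS Query/        { id = query_id(); if (id != "") sent[id] = 1 }
    /Received UDP DNS Response/ { id = query_id(); if (id != "") delete sent[id] }
    END { for (id in sent) print id }
  ' | sort
}

# Example usage on the affected Mac (not run automatically):
#   scutil --dns | list_nameservers | query_each_server google.com
#   grep 'UDP DNS' /var/log/system.log | unanswered_queries
```

If query_each_server shows one server timing out while the other answers, that would fit the intermittent round-robin failure described in the first pointer. If unanswered_queries prints IDs while packet logging is on, the queries are leaving the machine but no replies are coming back.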