From time to time, Linux and Unix users face various network problems. Many of these problems are described here and on other troubleshooting forums, but those posts are very specific and contain a lot of additional technical detail, so it is sometimes difficult to see the main point and the real cause of the faulty system behavior.
By asking this question, I intend to start a community wiki page that generalizes our network troubleshooting and debugging experience. I hope this page helps Linux and Unix users recognize and solve ("divide and conquer") their network problems more easily.
The parent of this page should be "Best practise to diagnose problems", but here we should focus on troubleshooting network problems from user space and kernel space.
I suppose that if you:
- Share information about a great network diagnostic tool, with concrete usage examples and examples of the network bugs it helps to catch
- Share a link to a great network tutorial on this subject
- Describe a general method or recipe for tackling some class of network problems
- Share information about your tool set for network debugging and troubleshooting
it would fit this topic perfectly.
I'll begin by sharing a link to various diagnostic tools and a simple 12-year-old tutorial. The Arch Linux tutorial also seems to have up-to-date information on our subject. And to dive into Linux networking, we definitely need to visit the Linux Networking-HOWTO.
Best Answer
I think the general principles of network troubleshooting are:
As for me, I usually gather all the required information with whatever tools are needed and try to match it against my experience. Deciding which layer of the network stack contains the bug helps to rule out unlikely causes. Drawing on other people's experience helps to solve problems quickly, but it often means that I solve a problem without understanding it, and if the problem occurs again I cannot tackle it without the Internet.
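As a rough illustration of deciding which layer contains the bug, the sketch below walks up the stack and prints the first layer that fails a check. The function name `first_failing_layer` and its three arguments are made up for this example, and it assumes Linux with iproute2 and iputils `ping` (where `-W` is a reply timeout in seconds).

```sh
#!/bin/sh
# Sketch only: first_failing_layer and its arguments are hypothetical.
# Assumes Linux with iproute2 and iputils ping (-W is a timeout in seconds).

first_failing_layer() {
    iface=$1        # interface to inspect, e.g. eth0
    gateway=$2      # address of the default gateway
    remote_ip=$3    # address of some host beyond the gateway

    # Link: LOWER_UP in the flags means the interface has carrier.
    if ! ip link show dev "$iface" 2>/dev/null | grep -q 'LOWER_UP'; then
        echo "link"; return 1
    fi
    # Addressing: is any IPv4 address configured on the interface?
    if ! ip -4 addr show dev "$iface" 2>/dev/null | grep -q 'inet '; then
        echo "address"; return 1
    fi
    # Local network: does the gateway answer a single echo request?
    if ! ping -c 1 -W 2 "$gateway" >/dev/null 2>&1; then
        echo "gateway"; return 1
    fi
    # Routing: does a host beyond the gateway answer?
    if ! ping -c 1 -W 2 "$remote_ip" >/dev/null 2>&1; then
        echo "routing"; return 1
    fi
    # Nothing failed at the layers checked here.
    echo "none"
}
```

It could be called as `first_failing_layer eth0 192.0.2.1 198.51.100.7` (documentation addresses, substitute your own). Keep in mind that hosts and firewalls may drop ICMP, so a failed `ping` step is a hint about where to look, not proof.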
And in general, I don't know how I solve network problems. It seems that there is some magic function in my brain named `SolveNetworkProblem(information_about_system_state, my_experience, people_experience)`, which sometimes returns exactly the right answer and sometimes fails (like here: TCP dies on a Linux laptop).

I usually use utilities from this set for network debugging:
- `ifconfig` (or `ip link`, `ip addr`) - for obtaining information about network interfaces.
- `ping` - for checking whether the target host is reachable from my machine. `ping` can also be used for basic DNS diagnostics: we can ping a host by IP address or by hostname and then decide whether DNS works at all. And then `traceroute` or `tracepath` or `mtr` to look at what's going on along the way there.
- `dig` - diagnose everything DNS.
- `dmesg | less` or `dmesg | tail` or `dmesg | grep -i error` - for understanding what the Linux kernel thinks about some trouble.
- `netstat -antp` + `| grep smth` - my most common usage of the netstat command, which shows information about TCP connections. Often I filter the output with grep. See also the new `ss` command (from `iproute2`, the new standard suite of Linux networking tools) and `lsof`, as in `lsof -ai tcp -c some-cmd`.
- `telnet <host> <port>` - very useful for talking to various TCP services (e.g. over the SMTP or HTTP protocols); it also lets us check whether connecting to a given TCP port is possible at all.
- `iptables-save` (on Linux) - to dump the full iptables tables.
- `ethtool` - get all the network interface card parameters (link status, speed, offload parameters...).
- `socat` - the Swiss army tool to test all network protocols (UDP, multicast, SCTP...). Especially useful (more so than telnet) with a few `-d` options.
- `iperf` - to test bandwidth availability.
- `openssl` (`s_client`, `ocsp`, `x509`...) - to debug all SSL/TLS/PKI issues.
- `wireshark` - the powerful tool for capturing and analyzing network traffic, which allows you to analyze and catch many network bugs.
- `iftop` - show big users on the network/router.
- `iptstate` (on Linux) - current view of the firewall's connection tracking.
- `arp` (or the new (Linux) `ip neigh`) - show the ARP table status.
- `route` or the newer (on Linux) `ip route` - show the routing table status.
- `strace` (or `truss`, `dtrace` or `tusc`, depending on the system) - a useful tool that shows which system calls the problem process makes; it also shows error codes (errno) when system calls fail. This information is often enough to understand the system behavior and solve a problem. Alternatively, setting breakpoints on some networking functions in `gdb` lets you find out when they are called and with which arguments.
- `iptables -nvL` - shows how many packets are matched by each rule (`iptables -Z` to zero the counters). The `LOG` target inserted in the firewall chains is useful to see which packets reach them and how they have already been transformed when they get there. To go further, `NFLOG` (associated with `ulogd`) will log the full packet.
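
The `ping` entry in the list above suggests pinging a host by IP address and by hostname to decide whether DNS works at all. A minimal sketch of that comparison follows, with `dig` as the follow-up step. The helper names `classify_ping` and `show_a_records` are invented for this example, and the `-c`/`-W` flags assume Linux iputils `ping`.

```sh
#!/bin/sh
# Sketch only: classify_ping and show_a_records are hypothetical helpers.
# Assumes Linux iputils ping (-c count, -W reply timeout in seconds).

classify_ping() {
    target_ip=$1     # an address you already know for the host
    target_name=$2   # the hostname that should resolve to it

    if ! ping -c 1 -W 2 "$target_ip" >/dev/null 2>&1; then
        # No reply even by address: routing, a firewall, or the host itself.
        # (ICMP may simply be filtered, so this is a hint, not proof.)
        echo "unreachable-by-ip"
    elif ! ping -c 1 -W 2 "$target_name" >/dev/null 2>&1; then
        # The address answers but the name does not: look at DNS next.
        echo "dns-suspect"
    else
        echo "ok"
    fi
}

# Follow-up for the "dns-suspect" case: print the A records the resolver
# returns for a name. Empty output means no address came back.
show_a_records() {
    dig +short "$1" A
}
```

For example, `classify_ping 192.0.2.50 host.example` followed by `show_a_records host.example` (both values are placeholders) separates "the host is down" from "the name does not resolve" before reaching for heavier tools such as `wireshark`.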