Linux – how to check rx ring, max_backlog, and max_syn_backlog size

kernel, linux, networking, tcp

Quite often in the course of troubleshooting and tuning things I find myself thinking about the following Linux kernel settings:

net.core.netdev_max_backlog
net.ipv4.tcp_max_syn_backlog
net.core.somaxconn

Aside from fs.file-max, net.ipv4.ip_local_port_range, net.core.rmem_max, net.core.wmem_max, net.ipv4.tcp_rmem, and net.ipv4.tcp_wmem, these seem to be the important knobs to adjust when you are tuning a box for high levels of concurrency.

My question: how can I check how many items are in each of those queues? Usually people just set them super high, but I would like to log these queue sizes to help predict future failures and catch issues before they manifest in a user-noticeable way.

Best Answer

I too have wondered this and was motivated by your question!

For each of the queues you listed, I've collected the closest thing I could find to a way of monitoring it, along with some related information. I welcome comments/feedback; any improvement to monitoring makes things easier to manage!

net.core.somaxconn

net.ipv4.tcp_max_syn_backlog

net.core.netdev_max_backlog

$ netstat -an | grep -c SYN_RECV 

This shows the current global count of connections sitting in the SYN queue. You can break it up per port and put it in exec statements in snmpd.conf if you want to poll it from a monitoring application.
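
If ss from iproute2 is available, it gives a more direct view of the accept queue: for LISTEN sockets, Recv-Q is the current accept-queue depth and Send-Q is the configured backlog (which net.core.somaxconn caps). A rough sketch of both checks, with the per-port SYN_RECV breakdown done in awk; the column positions assume the usual netstat -ant layout, so treat it as a starting point:

# Accept-queue depth (Recv-Q) vs. configured backlog (Send-Q) per listener.
ss -lnt

# Per-port count of half-open connections (relates to net.ipv4.tcp_max_syn_backlog);
# assumes the local address is column 4 and the state is column 6 of "netstat -ant".
netstat -ant | awk '$6 == "SYN_RECV" { n = split($4, a, ":"); port[a[n]]++ }
                    END { for (p in port) print port[p], p }' | sort -rn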

From:

netstat -s

These counters show how often requests are being served from the backlog queue, and when drops occur:

146533724 packets directly received from backlog
TCPBacklogDrop: 1029
3805 packets collapsed in receive queue due to low socket buffer
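
For net.core.netdev_max_backlog specifically, /proc/net/softnet_stat is also worth watching; as far as I can tell, the second hex column on each per-CPU row counts packets dropped because the backlog queue was full. A minimal sketch (assumes GNU awk for strtonum):

# Per-CPU drop counters from /proc/net/softnet_stat; values in the file are hexadecimal.
awk '{ printf "cpu%d dropped=%d\n", NR - 1, strtonum("0x" $2) }' /proc/net/softnet_stat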

fs.file-max

From:

http://linux.die.net/man/5/proc

$ cat /proc/sys/fs/file-nr
2720    0       197774

This (read-only) file gives the number of files presently opened. It contains three numbers: the number of allocated file handles, the number of free file handles, and the maximum number of file handles.
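
Since the third number is the limit itself, a one-liner along these lines (just a sketch) turns it into something you can log or graph as a utilization percentage:

# Allocated handles ($1) vs. the maximum ($3) straight from file-nr.
awk '{ printf "allocated=%d max=%d used_pct=%.2f\n", $1, $3, 100 * $1 / $3 }' /proc/sys/fs/file-nr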

net.ipv4.ip_local_port_range

If you can build an exclusion list of services (netstat -an | grep LISTEN), then you can deduce how many connections are being used for ephemeral activity:

netstat -an | egrep -v "MYIP.(PORTS|IN|LISTEN)"  | wc -l
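
An alternative sketch that skips the exclusion list: read the configured range straight from /proc and count sockets whose local port falls inside it (again assuming the standard netstat -ant column layout):

# Rough proxy for ephemeral-port usage against net.ipv4.ip_local_port_range.
read low high < /proc/sys/net/ipv4/ip_local_port_range
netstat -ant | awk -v low="$low" -v high="$high" \
    'NR > 2 { n = split($4, a, ":"); p = a[n] + 0
              if (p >= low + 0 && p <= high + 0) c++ }
     END { printf "ephemeral ports in use: %d of %d\n", c, high - low + 1 }'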

Should also monitor (from SNMP):

TCP-MIB::tcpCurrEstab.0

It may also be interesting to collect stats about all of the states seen in this tree (established/time_wait/fin_wait/etc.):

TCP-MIB::tcpConnState.*

net.core.rmem_max

net.core.wmem_max

You'd have to dtrace/strace your system for setsockopt requests; I don't think statistics for these requests are tracked otherwise. From my understanding, this isn't really a value that changes much: the application you've deployed will probably ask for a standard amount. I think you could 'profile' your application with strace and configure this value accordingly. (discuss?)
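
If it helps, here is roughly what that strace profiling could look like; the PID is a placeholder, and SO_RCVBUF/SO_SNDBUF are the socket options these sysctls cap (strace writes to stderr, hence the redirect):

# Attach to a running process (replace 1234 with your PID) and watch buffer-size requests.
strace -f -e trace=setsockopt -p 1234 2>&1 | grep -E 'SO_RCVBUF|SO_SNDBUF'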

net.ipv4.tcp_rmem

net.ipv4.tcp_wmem

To track how close you are to the limit, you would have to look (on a regular basis) at the average and maximum of the tx_queue and rx_queue fields from:

# cat /proc/net/tcp
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode                                                     
   0: 00000000:0FB1 00000000:0000 0A 00000000:00000000 00:00000000 00000000   500        0 262030037 1 ffff810759630d80 3000 0 0 2 -1                
   1: 00000000:A133 00000000:0000 0A 00000000:00000000 00:00000000 00000000   500        0 262029925 1 ffff81076d1958c0 3000 0 0 2 -1                
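
Those two values are hexadecimal byte counts (tx_queue:rx_queue is the fifth whitespace-separated field), so a cron-driven sample along these lines could record the max and average; a sketch that assumes GNU awk for strtonum:

# Max and average socket queue sizes from /proc/net/tcp, for graphing against tcp_rmem/tcp_wmem.
awk 'NR > 1 { split($5, q, ":")
              tx = strtonum("0x" q[1]); rx = strtonum("0x" q[2])
              if (tx > mtx) mtx = tx; if (rx > mrx) mrx = rx
              stx += tx; srx += rx; n++ }
     END { printf "sockets=%d max_tx=%d max_rx=%d avg_tx=%.0f avg_rx=%.0f\n",
                  n, mtx, mrx, (n ? stx / n : 0), (n ? srx / n : 0) }' /proc/net/tcp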

To track errors related to this:

# netstat -s
    40 packets pruned from receive queue because of socket buffer overrun

Should also be monitoring the global 'buffer' pool (via SNMP):

HOST-RESOURCES-MIB::hrStorageDescr.1 = STRING: Memory Buffers
HOST-RESOURCES-MIB::hrStorageSize.1 = INTEGER: 74172456
HOST-RESOURCES-MIB::hrStorageUsed.1 = INTEGER: 51629704
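
For completeness, polling those OIDs (plus tcpCurrEstab from earlier) with net-snmp looks roughly like this; the community string and host are placeholders:

# Replace "public" and "localhost" with your community string and target host.
snmpget -v2c -c public localhost \
    TCP-MIB::tcpCurrEstab.0 \
    HOST-RESOURCES-MIB::hrStorageUsed.1 \
    HOST-RESOURCES-MIB::hrStorageSize.1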