Linux – the difference between sock->sk_wmem_alloc and sock->sk_wmem_queued


The sock struct, defined in sock.h, has two attributes that seem very similar:

  • sk_wmem_alloc, which is defined as "transmit queue bytes committed"
  • sk_wmem_queued, defined as "persistent queue size"

To me, the sk_wmem_alloc is the amount of memory currently allocated for the send queue. But then, what is sk_wmem_queued?

References

  • According to this StackOverflow answer:

    wmem_queued: the amount of memory used by the socket send buffer queued in the transmit queue and are either not yet sent out or not yet acknowledged.

  • The ss man page also gives definitions, which don't really enlighten me (I don't understand what the IP layer has to do with this):

    wmem_alloc: the memory used for sending packet (which has been sent to layer 3)
    wmem_queued: the memory allocated for sending packet (which has not been sent to layer 3)

  • Someone already asked a similar question on the LKML, but got no answer
  • The sock_diag(7) man page also has its own definitions for these attributes:

    SK_MEMINFO_WMEM_ALLOC: The amount of data in send queue.
    SK_MEMINFO_WMEM_QUEUED: The amount of data queued by TCP, but not yet sent.

All these definitions differ, and none of them clearly explains how the _alloc and _queued variants differ.

Best Answer

I emailed Eric Dumazet, who contributes to the Linux network stack, and here is the answer:

sk_wmem_alloc tracks the number of bytes for skb queued after transport stack : qdisc layer and NIC TX ring buffers.

If you have 1 MB of data sitting in TCP write queue, not yet sent (cwnd limit), sk_wmem_queued will be about 1MB, but sk_wmem_alloc will be about 0
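To make that example concrete, here is a toy model in C. It is not kernel code: the struct, the function names and the `unsent` field are all hypothetical, and it counts payload bytes only, whereas the real counters also include per-skb overhead. It assumes, as far as I understand the TCP path, that data stays accounted in the write queue until it is ACKed, and that bytes are charged to the lower layers from the moment they are handed down until the driver has finished transmitting them.

```c
#include <stddef.h>

/* Toy model, NOT kernel code: hypothetical names, payload bytes only. */
struct toy_sock {
    size_t wmem_queued; /* bytes held in the socket write queue */
    size_t wmem_alloc;  /* bytes currently owned by qdisc / NIC TX ring */
    size_t unsent;      /* part of wmem_queued never handed down yet */
};

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* Application writes data: it lands in the socket write queue. */
static void toy_write(struct toy_sock *sk, size_t bytes)
{
    sk->wmem_queued += bytes;
    sk->unsent += bytes;
}

/* Transport hands at most `window` bytes to the lower layers.
 * The data stays in the write queue (assumed: kept until ACKed). */
static size_t toy_transmit(struct toy_sock *sk, size_t window)
{
    size_t n = min_sz(sk->unsent, window);
    sk->unsent -= n;
    sk->wmem_alloc += n;
    return n;
}

/* Driver finished sending: lower layers release their bytes. */
static void toy_tx_complete(struct toy_sock *sk, size_t bytes)
{
    sk->wmem_alloc -= min_sz(bytes, sk->wmem_alloc);
}

/* Peer ACKed data: only bytes already handed down can leave the queue. */
static void toy_ack(struct toy_sock *sk, size_t bytes)
{
    sk->wmem_queued -= min_sz(bytes, sk->wmem_queued - sk->unsent);
}
```

Calling `toy_write(&sk, 1000000)` followed by `toy_transmit(&sk, 0)` (a closed window) leaves `wmem_queued` at 1,000,000 and `wmem_alloc` at 0, which is the situation described in the quote.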

A very good document for understanding these three types of queues (socket buffer, qdisc queue and device queue) is this (rather long) article. In a nutshell, the socket starts by pushing packets directly onto the qdisc queue, which forwards them to the device queue. When the qdisc queue is full, the socket starts buffering the data in its own write queue.

the network stack places packets directly into the queueing discipline or else pushes back on the upper layers (eg socket buffer) if the queue is full

So basically: sk_wmem_queued is the memory used by the socket buffer (sock.sk_write_queue), while sk_wmem_alloc is the memory used by the packets in the qdisc and device queues.
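If you want to watch both counters on a live socket, one option is the SO_MEMINFO socket option, which returns the same array that sock_diag exposes as SK_MEMINFO_*. The sketch below assumes a reasonably recent Linux kernel with the kernel headers installed; `read_wmem` and `struct wmem_counters` are names I made up for illustration, and the fallback value for SO_MEMINFO is the asm-generic one, which differs on a few architectures.

```c
#include <stdint.h>
#include <sys/socket.h>
#include <linux/sock_diag.h>   /* SK_MEMINFO_* array indices */

#ifndef SO_MEMINFO
#define SO_MEMINFO 55          /* asm-generic value; assumption on old headers */
#endif

/* Hypothetical helper type: the two counters discussed above. */
struct wmem_counters {
    uint32_t wmem_alloc;   /* SK_MEMINFO_WMEM_ALLOC  */
    uint32_t wmem_queued;  /* SK_MEMINFO_WMEM_QUEUED */
};

/* Returns 0 on success, -1 if the kernel rejects SO_MEMINFO. */
static int read_wmem(int fd, struct wmem_counters *out)
{
    uint32_t mem[SK_MEMINFO_VARS] = {0};
    socklen_t len = sizeof(mem);

    if (getsockopt(fd, SOL_SOCKET, SO_MEMINFO, mem, &len) != 0)
        return -1;

    out->wmem_alloc = mem[SK_MEMINFO_WMEM_ALLOC];
    out->wmem_queued = mem[SK_MEMINFO_WMEM_QUEUED];
    return 0;
}
```

Polling `read_wmem()` on a sending TCP socket while the peer is slow to read should show wmem_queued growing well beyond wmem_alloc, though the exact numbers depend on the kernel version and on per-skb overhead.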
