Linux: Total swap used = swap used by processes +

linux, memory, swap

So, I'm trying to investigate where the swap usage comes from on a system with high swap usage:

# free
             total       used       free     shared    buffers     cached
Mem:        515324     508800       6524          0       4852      27576
-/+ buffers/cache:     476372      38952
Swap:       983032     503328     479704

Adding up swap used per process:

# for proc in /proc/*; do cat $proc/smaps 2>/dev/null | awk '/Swap/{swap+=$2}END{print swap "\t'`readlink $proc/exe`'"}'; done | sort -n | awk '{total+=$1}/[0-9]/;END{print total "\tTotal"}'
0       /bin/gawk
0       /bin/sort
0       /usr/bin/readlink
28      /sbin/xxxxxxxx
52      /sbin/mingetty
52      /sbin/mingetty
52      /sbin/mingetty
52      /sbin/mingetty
56      /sbin/mingetty
56      /sbin/mingetty
60      /xxxxxxxxxxx
60      /usr/sbin/xxx
84      /usr/sbin/xxx
108     /usr/bin/xxx
168     /bin/bash
220     /sbin/init
256     /sbin/rsyslogd
352     /bin/bash
356     /bin/bash
360     /usr/sbin/sshd
496     /usr/sbin/crond
672     /usr/sbin/sshd
12972   /opt/jdk1.6.0_22/bin/java
80392   /usr/libexec/mysqld
311876  /opt/jdk1.6.0_22/bin/java
408780  Total

This gives a lower total than the used swap reported by free. Where is the remaining used swap space? Is it vmalloc()'ed memory inside the kernel? Something else? How can I identify it?

Output of meminfo:

# cat /proc/meminfo 
MemTotal:       515324 kB
MemFree:          6696 kB
Buffers:          5084 kB
Cached:          28056 kB
SwapCached:     157512 kB
Active:         429372 kB
Inactive:        65068 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       515324 kB
LowFree:          6696 kB
SwapTotal:      983032 kB
SwapFree:       478712 kB
Dirty:             100 kB
Writeback:           0 kB
AnonPages:      399456 kB
Mapped:           8792 kB
Slab:             7744 kB
PageTables:       1820 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:   1240692 kB
Committed_AS:  1743904 kB
VmallocTotal:   507896 kB
VmallocUsed:      3088 kB
VmallocChunk:   504288 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     4096 kB

Best Answer

The difference you are observing isn't actually due to swap space being unaccounted for. Some kernels append " (deleted)" to the target of a /proc/*/exe symlink when the original executable for the process is no longer around. readlink prints that suffix, it breaks your awk script, and as a result processes whose binaries are no longer present are left out of your total.

For such a link, readlink prints something like "/path/to/bin (deleted)". Because the backtick substitution in your command is unquoted, the shell splits that output at the space, so awk receives a program that stops in the middle of a string constant (with "(deleted)" passed along as a separate argument) and fails to parse it. To see which processes are affected, do this:

for a in /proc/*/exe ; do readlink $a ; done | grep deleted

You will see a few entries with "(deleted)" appended. If you add up the swap usage of those processes, it should account for the discrepancy you see, because the resulting awk errors keep their usage from being calculated and included in the final total.
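To check that against the numbers in the question, here is a rough sketch. It first computes the gap between what free reports and what the script totalled, then sums the Swap: lines in smaps for only those processes whose exe link ends in " (deleted)". The function name deleted_swap and its optional directory argument are my own additions (the argument just makes it possible to point the function at a test directory instead of /proc).

```shell
#!/bin/sh
# Gap between free's "Swap used" and the script's total, in kB
# (both numbers are copied from the question).
free_used=503328
script_total=408780
echo "unaccounted: $((free_used - script_total)) kB"   # prints: unaccounted: 94548 kB

# Sum the "Swap:" lines in smaps for every process whose exe link target
# ends in " (deleted)". The optional argument is a hypothetical proc root
# (default /proc), added only so the function can be tried on a fake tree.
deleted_swap() {
    root=${1:-/proc}
    total=0
    for dir in "$root"/[0-9]*; do
        # Skip entries whose exe link we cannot read (other users, kernel threads).
        target=$(readlink "$dir/exe" 2>/dev/null) || continue
        case $target in
            *" (deleted)")
                # Anchor on "Swap:" so fields such as SwapPss are not counted too.
                kb=$(awk '/^Swap:/ { s += $2 } END { print s + 0 }' "$dir/smaps" 2>/dev/null) || kb=0
                total=$((total + ${kb:-0}))
                ;;
        esac
    done
    echo "$total"
}

deleted_swap
```

Run it as root, otherwise the exe links and smaps files of other users' processes are unreadable and get skipped. If the second number lands close to the first, the deleted binaries explain the gap.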

If you run your original command without redirecting stderr anywhere, you will probably notice a few "runaway string constant" errors. Those errors are a result of the above and you should not have ignored them.
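If you want to see the splitting without touching /proc, here is a minimal sketch. The path in target is made up, and count_args is just a throwaway helper that reports how many arguments the shell passes along. It assumes a POSIX-style shell such as sh or bash (zsh does not split unquoted variables by default).

```shell
#!/bin/sh
# A made-up link target of the kind readlink prints for a removed binary.
target='/path/to/bin (deleted)'

# Throwaway helper: print how many arguments were received.
count_args() { echo "$#"; }

# Unquoted expansion, as in the original command: the shell splits at the
# space, so awk would get a program that ends in the middle of a string.
count_args 'print "\t'$target'"'      # prints: 2

# Quoted expansion: the whole thing stays a single argument.
count_args 'print "\t'"$target"'"'    # prints: 1
```

The first call mirrors what happens to the awk program text in the original one-liner, which is where the "runaway string constant" message comes from.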

Ignoring other potential improvements to your original command, you could modify it by removing the " (deleted)", like this (note |awk '{print $1}' added to readlink output):

for proc in /proc/*; \
  do cat $proc/smaps 2>/dev/null | awk '/Swap/{swap+=$2}END{print swap "\t'`readlink $proc/exe|awk '{print $1}' `'" }'; \
done | sort -n | awk '{total+=$1}/[0-9]/;END{print total "\tTotal"}'

Using awk to trim the readlink output this way will still break if the path itself contains spaces, since it keeps only the first whitespace-separated field. You can use sed or whatever other method you prefer instead.
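One way to drop only the suffix and leave spaces in the path alone is POSIX parameter expansion. This is a small sketch, not the only way to do it; strip_deleted is a name I made up and the example paths are invented.

```shell
#!/bin/sh
# Remove a trailing " (deleted)" from a link target, if there is one.
# ${1% (deleted)} strips the shortest matching suffix and leaves the
# string unchanged when the suffix is absent.
strip_deleted() {
    printf '%s\n' "${1% (deleted)}"
}

strip_deleted '/opt/my app/bin/server (deleted)'   # prints: /opt/my app/bin/server
strip_deleted '/usr/sbin/sshd'                     # prints: /usr/sbin/sshd
```

In the loop you would call it as strip_deleted "$(readlink "$proc/exe")", keeping the double quotes so the shell does not split the path again. Piping readlink through sed 's/ (deleted)$//' has the same effect.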

Bonus Info

By the way, you could just use smem -t. The "Swap" column displays what you want.

If you would rather calculate it yourself, you can also get this information more directly from the VmSwap field in /proc/*/status (smaps requires some kernel support and isn't always available). Using a filename pattern that matches only the process directories also avoids the errors to begin with, so there is no need to redirect error output:

for proc in /proc/[0-9]*; do \
  awk '/VmSwap/ { print $2 "\t'`readlink $proc/exe | awk '{ print $1 }'`'" }' $proc/status; \
done | sort -n | awk '{ total += $1 ; print $0 } END { print total "\tTotal" }'

If you don't need the actual binary and can deal with just having the process name, you can get everything from status:

for a in /proc/*/status ; do \
  awk '/^(VmSwap|Name):/ { printf "%s ", $2 } END { print "" }' $a ; \
done | awk '{ total+=$2 ; print $0 } END { print "Total " total }'

And finally, if just having the PIDs suffices, you can just do it all with awk:

awk '/VmSwap/ { total += $2; print $2 "\t" FILENAME } END { print total "\tTotal" }' /proc/*/status

Note:

This isn't to say that free and smem (which reports the same thing as your script) never disagree. They can differ in plenty of ways (see, for example, https://www.google.com/search?q=smem+free, which has more than enough results on the first page to answer your questions about memory usage). But without a proper test, your specific situation cannot be addressed.
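As a starting point for such a test, here is a hedged sketch that puts the two numbers side by side: system-wide swap in use (SwapTotal minus SwapFree from meminfo) and the sum of VmSwap over all processes. The function names and their optional path arguments are my own, and the two values are read at slightly different moments, so a small difference is expected.

```shell
#!/bin/sh
# Swap in use system-wide, in kB: SwapTotal - SwapFree from a meminfo-style file.
# The optional argument (default /proc/meminfo) is only there for testing.
swap_used_kb() {
    awk '/^SwapTotal:/ { t = $2 } /^SwapFree:/ { f = $2 } END { print t - f }' "${1:-/proc/meminfo}"
}

# Sum of VmSwap over every status file under a proc-style root, in kB.
# Processes without a VmSwap line (kernel threads) simply add nothing.
vmswap_sum_kb() {
    root=${1:-/proc}
    cat "$root"/[0-9]*/status 2>/dev/null |
        awk '/^VmSwap:/ { s += $2 } END { print s + 0 }'
}

# Only run the comparison where a real /proc is available.
if [ -r /proc/meminfo ]; then
    used=$(swap_used_kb)
    procs=$(vmswap_sum_kb)
    echo "system-wide: $used kB, per-process: $procs kB, difference: $((used - procs)) kB"
fi
```

With the meminfo output from the question, swap_used_kb gives 983032 - 478712 = 504320 kB, which is close to the 503328 kB that free reported.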
