Seeing another of your posts, I guess you are using zram, so that will be my assumption here. I did the experiment of installing zram and consuming a lot of memory, and I got the same smem output as you. smem does not take zram into account in its accounting; it only uses /proc/meminfo to compute its values, and if you read the code you will see that the RAM occupied by zram ends up counted under the noncache column of the "kernel dynamic memory" line.
Further investigations
Following my gut feeling that zram was behind this behaviour, I set up a VM with similar specs to your machine: 4 GB RAM and 2 GB of zram swap, no swap file.
I have loaded the VM with heavy weight applications and got the following state:
huygens@ubuntu:~$ smem -wt -K ~/vmlinuz-3.2.0-38-generic.unpacked -R 4096M
Area Used Cache Noncache
firmware/hardware 130717 0 130717
kernel image 13951 0 13951
kernel dynamic memory 1063520 922172 141348
userspace memory 2534684 257136 2277548
free memory 451432 451432 0
----------------------------------------------------------
4194304 1630740 2563564
huygens@ubuntu:~$ free -m
total used free shared buffers cached
Mem: 3954 3528 426 0 79 858
-/+ buffers/cache: 2589 1365
Swap: 1977 0 1977
As you can see, free reports 858 MB of cache memory, and that is also roughly what smem reports in the Cache column of the "kernel dynamic memory" line.
Then I further stressed the system using the Chromium browser. At the beginning, only 83 MB of swap were in use, but after a few more tabs were opened, swap usage quickly jumped to almost its maximum and I experienced OOM! zram really has a dangerous side: wrongly configured (with sizes too big), it can quickly hit you back like a trebuchet.
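One way to avoid that trap is to keep the zram device a modest fraction of RAM. A sketch, where the 25% ratio is my own conservative choice (not an official recommendation), run as root before swapon:

```shell
# Size zram0 to ~25% of physical RAM (disksize accepts K/M/G suffixes)
ram_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
echo "$(( ram_kb / 4 ))K" > /sys/block/zram0/disksize
```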
At that time I had the following outputs:
huygens@ubuntu:~$ smem -wt -K ~/vmlinuz-3.2.0-38-generic.unpacked -R 4096M
Area Used Cache Noncache
firmware/hardware 130717 0 130717
kernel image 13951 0 13951
kernel dynamic memory 1355344 124072 1231272
userspace memory 961004 36456 924548
free memory 1733288 1733288 0
----------------------------------------------------------
4194304 1893816 2300488
huygens@ubuntu:~$ free -m
total used free shared buffers cached
Mem: 3954 2256 1698 0 4 132
-/+ buffers/cache: 2118 1835
Swap: 1977 1750 227
See how the kernel dynamic memory columns (Cache and Noncache) look inverted? In the first case, the kernel held "cached" memory, as also reported by free. In the second case, it held swapped pages inside zram, which smem does not know how to account for. Check the smem source code: zram occupation is not reported in /proc/meminfo, so it is not computed by smem, which simply does "total kernel memory" minus "the meminfo entries it knows to be cache". What it does not know is that the computed total kernel memory includes the size of the swap, which is held in RAM!
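To see the kind of arithmetic involved, here is a rough approximation of the "cache" side of that subtraction, summing /proc/meminfo fields that smem treats as cache (the exact field list is in smem's source; this sketch uses three common ones):

```shell
# Approximate the kernel "cache" figure from /proc/meminfo (values in kB)
awk '/^(Buffers|Cached|SReclaimable):/ {cache += $2}
     END {printf "approx cache: %d kB\n", cache}' /proc/meminfo
```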
When I was in this state, I activated a hard disk swap, turned off the zram swap, and reset the zram device: echo 1 > /sys/block/zram0/reset.
After that, the noncache kernel memory melted like snow in summer and returned to a "normal" value.
Conclusion
smem does not know about zram (yet), perhaps because zram is still in staging and thus not reported in /proc/meminfo, which provides global parameters (like (in)active page sizes and total memory) plus only a few specific ones. smem identifies some of these specific parameters as "cache", sums them up, and compares the sum to total memory. Because of that, memory used by zram gets counted in the noncache column.
Note: by the way, on modern kernels, meminfo also reports the shared memory consumed. smem does not yet take that into account either, so even without zram, the output of smem should be considered carefully, especially if you use applications that make heavy use of shared memory.
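For reference, that shared-memory figure is the Shmem line of /proc/meminfo on such kernels; a quick way to read it:

```shell
# Shmem covers tmpfs plus shared anonymous mappings, in kB
awk '/^Shmem:/ {print "Shmem: " $2 " kB"}' /proc/meminfo
```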
Using smem to show a total of all user memory, no swap, and without counting any shared memory twice:
sudo smem -c pss -t | tail -1
Output on my system:
4119846
Unrolling that:
-c pss selects the column, in this case PSS. From man smem:
smem reports physical memory usage, taking shared memory pages
into account. Unshared memory is reported as the USS (Unique
Set Size). Shared memory is divided evenly among the processes
sharing that memory. The unshared memory (USS) plus a
process's proportion of shared memory is reported as the PSS
(Proportional Set Size). The USS and PSS only include physical
memory usage. They do not include memory that has been swapped
out to disk.
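The PSS arithmetic from that excerpt can be sketched with made-up numbers (all values hypothetical):

```shell
# One process: 100 KiB unique (USS) plus a 300 KiB region shared by 3 processes
uss=100
shared=300
procs=3
echo "PSS = $(( uss + shared / procs )) KiB"
# prints: PSS = 200 KiB
```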
-t shows a total or sum of all PSS used at the end, and tail -1 nips off the preceding data.
To show just the total unshared user memory, replace -c pss with -c uss:
sudo smem -c uss -t | tail -1
Output:
3874356
Note the above PSS total is more or less the same number as the Used value on the "userspace memory" line here:
smem -w
Output:
Area Used Cache Noncache
firmware/hardware 0 0 0
kernel image 0 0 0
kernel dynamic memory 1367712 1115708 252004
userspace memory 4112112 419884 3692228
free memory 570060 570060 0
Best Answer
The difference you are observing isn't actually due to swap space being unaccounted for. The "(deleted)" that the kernel sometimes appends to /proc/*/exe links is output by readlink and is causing parse errors in your awk script, so you are effectively not counting processes whose binaries are no longer present.

Some kernels append the word "(deleted)" to /proc/*/exe symlink targets when the original executable for the process is no longer around; that is why your command shows less than the total. The output of readlink on such links will be something like "/path/to/bin (deleted)", which causes a parse error in awk when the output is substituted back into the string (it doesn't like the parentheses and spaces). Run readlink on your /proc/*/exe links and you will see a few entries with "(deleted)" appended. If you looked at the swap usage for these entries, their total would match the discrepancy you see, as the resulting awk errors prevent their totals from being calculated and included in the final total.

If you run your original command without redirecting stderr anywhere, you will probably notice a few "runaway string constant" errors. Those errors are a result of the above, and you should not have ignored them.
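A minimal reproduction of the suffix and one way to strip it (the path is hypothetical):

```shell
# Simulate readlink output for a process whose binary was removed,
# then strip the " (deleted)" suffix before using the path elsewhere
echo "/usr/bin/example (deleted)" | sed 's/ (deleted)$//'
# prints: /usr/bin/example
```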
Ignoring other potential improvements to your original command, you could modify it by removing the " (deleted)" suffix, for example by appending |awk '{print $1}' to the readlink output. This use of awk to fix the output of readlink may break if the name contains spaces; you can use sed or whatever method you prefer.

Bonus Info
By the way, you could just use smem -t. The "Swap" column displays what you want.

As for calculating it yourself, you can also get this information more directly from the VmSwap field in /proc/*/status (smaps requires some kernel support and isn't always available), and you can avoid having to redirect error output by using a filename pattern that avoids the errors to begin with. If you don't need the actual binary path and can deal with just having the process name, you can get everything from status. And finally, if just having the PIDs suffices, you can do it all with awk.
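The original commands are not preserved here, but a sketch of the VmSwap approach might look like this (the [0-9]* glob only matches numeric PID directories, skipping entries like /proc/self, so no stderr redirection is needed):

```shell
# Sum the VmSwap fields (in kB) across all processes' status files
awk '/^VmSwap:/ {total += $2} END {print total+0 " kB swapped"}' /proc/[0-9]*/status
```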
Note:
Now this isn't to say that there aren't differences between free and smem (the latter behaving like your script). There are plenty (see, for example, https://www.google.com/search?q=smem+free, which has more than enough results on the first page to answer your questions about memory usage). But without a proper test, your specific situation cannot be addressed.