Tracking down "missing" memory usage in Linux

linux, memory, memory-leaks

On an Arch 3.6.7 x86_64 kernel I am trying to account for the memory usage of the system, and the more I look at it, the more there appears to be a hole (in the accounting of used memory; a non-hole in the usage of it).

This is a freshly booted system, with not much running other than systemd and sshd, to keep it simple:

$ ps aux | sort -n -k6
...
root       316  0.0  0.0   7884   812 tty1     Ss+  14:37   0:00 /sbin/agetty --noclear tty1 38400
matt       682  0.0  0.0  24528   820 pts/0    S+   15:09   0:00 sort -n -k6
dbus       309  0.0  0.0  17280  1284 ?        Ss   14:37   0:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
matt       681  0.0  0.0  10808  1364 pts/0    R+   15:09   0:00 ps aux
root       308  0.0  0.0  26060  1516 ?        Ss   14:37   0:00 /usr/lib/systemd/systemd-logind
root       148  0.0  0.0  25972  1692 ?        Ss   14:37   0:00 /usr/lib/systemd/systemd-udevd
matt       451  0.0  0.0  78180  2008 ?        S    14:37   0:00 sshd: matt@pts/0
root       288  0.0  0.0  39612  2708 ?        Ss   14:37   0:00 /usr/sbin/sshd -D
matt       452  0.0  0.0  16452  3248 pts/0    Ss   14:37   0:00 -bash
root         1  0.0  0.0  32572  3268 ?        Ss   14:37   0:00 /sbin/init
root       299  0.0  0.0  69352  3604 ?        Ss   14:37   0:00 /usr/sbin/syslog-ng -F
root       449  0.0  0.0  78040  3800 ?        Ss   14:37   0:00 sshd: matt [priv]
root       161  0.0  0.0 358384  9656 ?        Ss   14:37   0:00 /usr/lib/systemd/systemd-journald

The most detailed memory info I can find is this from 2007, which appears to have resulted in the addition of the Pss field to the kernel's general per-process accounting, but their Python code is for older kernels and unfortunately some of the /proc/k* files have disappeared since then. The /proc/meminfo documentation is also helpful, but aging a bit too.
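As a rough stand-in for that older script, just summing the Pss lines out of smaps seems to be enough on this kernel (a sketch only: run as root so every process is readable, and note kernel threads have no smaps at all):

# cat /proc/[0-9]*/smaps 2>/dev/null | awk '/^Pss:/ { kb += $2 } END { print kb " kB total Pss" }'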

So, a demonstration of what I'm seeing.

# cat /proc/meminfo
MemTotal:       16345780 kB
MemFree:        16129940 kB
Buffers:           10360 kB
Cached:            48444 kB
SwapCached:            0 kB
Active:            24108 kB
Inactive:          46724 kB
Active(anon):      12104 kB
Inactive(anon):     3616 kB
Active(file):      12004 kB
Inactive(file):    43108 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:         11996 kB
Mapped:            16372 kB
Shmem:              3696 kB
Slab:              25092 kB
SReclaimable:      11716 kB
SUnreclaim:        13376 kB
KernelStack:         928 kB
PageTables:         2428 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     8172888 kB
Committed_AS:      34304 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      372788 kB
VmallocChunk:   34359362043 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       12288 kB
DirectMap2M:    16680960 kB

If we add up the used:

MemTotal - MemFree - Buffers - Cached = Used
16345780 - 16129940 - 10360 - 48444 = 157036
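The same sum pulled straight out of /proc/meminfo, so it is easy to re-run later (just an awk sketch):

awk '/^(MemTotal|MemFree|Buffers|Cached):/ { m[$1] = $2 }
     END { print m["MemTotal:"] - m["MemFree:"] - m["Buffers:"] - m["Cached:"], "kB used" }' /proc/meminfo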

All the Active*/Inactive* figures seem to be counters applied over some pages (not all), so they could duplicate what is counted elsewhere.

Active + Inactive = Used
24108  + 46724    = 70832 (not quite)

Committed_AS here seems to track closely with the sum of userspace private/shared memory (discounting shared files) from /proc/*/smaps. Taking PSS into account also lines up. (Out of interest, I get a much, much larger Committed_AS on a 32-bit Debian 2.6.32-5-686.)

AnonPages + Mapped + Committed_AS = Userspace?
11996     + 16372  + 34304       = 62672

Slab is in line with /proc/slabinfo
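(As a rough cross-check, totalling /proc/slabinfo directly comes out in the same region; this assumes the 2.1 format, where field 15 is num_slabs and field 6 is pagesperslab, and 4 kB pages:)

# awk 'NR > 2 { kb += $15 * $6 * 4 } END { print kb " kB in slabs" }' /proc/slabinfo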

Slab +  Shmem + KernelStack + PageTables = Kernelspace?
25092 + 3696  + 928         + 2428       = 32144

Userspace? + Kernelspace? = Used?
62672      + 32144        = 94816
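All of those sums in one go, so they're easy to re-check on another box (just a sketch; the grouping into userspace/kernelspace is my guess from above):

awk '{ m[$1] = $2 }
     END {
       used = m["MemTotal:"] - m["MemFree:"] - m["Buffers:"] - m["Cached:"]
       user = m["AnonPages:"] + m["Mapped:"] + m["Committed_AS:"]
       kern = m["Slab:"] + m["Shmem:"] + m["KernelStack:"] + m["PageTables:"]
       printf "used %d  user? %d  kern? %d  unaccounted %d kB\n", used, user, kern, used - user - kern
     }' /proc/meminfo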

So I'm ~63MB short. It strikes me that the kernel itself and all the loaded modules must account for some MB that isn't counted here. The slab seems to cover a lot though, so if anything is missing I'm not sure it would equate to ~60MB?

63MB is kinda close to the Active + Inactive figure, but that doesn't feel right.

So does anyone know the magic formula? Otherwise, if the figures I am looking at are the right ones, what are the grey areas in the memory allocation that I can poke about in?

It appears Linux ate my RAM! Albeit a smaller portion than it is normally accused of =)

edit: Committed_AS is a guesstimate from the kernel of how much memory it would need to cover 99.9% of what it has committed, so it isn't a real allocated number. AnonPages + Mapped is a component of it, so that leaves a larger hole, about 100MB now.

User + Kernel
28368 + 32144 = 60512 != 157036

AnonPages and Mapped mostly track with the anon/mapped info from /proc/[0-9]*/smaps when taking PSS/Shared into account.
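(This is the sort of check I mean, for the anon side; a rough sketch, run as root, and summing Anonymous over every process can slightly double-count shared anonymous pages:)

# cat /proc/[0-9]*/smaps 2>/dev/null | awk '/^Anonymous:/ { kb += $2 } END { print kb " kB anonymous vs AnonPages in meminfo" }'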

The reserved areas appear to all fit into the chunk taken off total memory:

Total free memory is 16345032 kB
Total system memory is 16777216 kB
PCI 'hole' – lspci -v: 266520 kB, leaving 16510696 kB
BIOS reserved – dmesg: 92793 kB, leaving 16417903 kB
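A rough way to re-derive the reserved chunks without copying numbers out of lspci/dmesg by hand is to total the reserved ranges in /proc/iomem (a sketch only: it assumes GNU awk for strtonum, and nested entries could double-count a little):

# awk '/[Rr]eserved/ { split($1, r, "-"); kb += (strtonum("0x" r[2]) - strtonum("0x" r[1]) + 1) / 1024 } END { printf "%d kB reserved\n", kb }' /proc/iomem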

edit2
I noticed this extra memory usage wasn't present on the VMs running inside the original box that the /proc/meminfo came from. So I started poking around to see what was different between the two, and eventually found that an increase in the total physical memory available coincided with an increase in the used memory.

phys 16GB used>144508     vm>50692      user>21500      kern>26428      u+ktot>47928
vm   64MB used>24612      vm>31140      user>14956      kern>14440      u+ktot>29396
vm  256MB used>26316      vm>35260      user>14752      kern>14780      u+ktot>29532
vm    1GB used>33644      vm>35224      user>14936      kern>14772      u+ktot>29708
vm    2GB used>41592      vm>35048      user>14736      kern>15056      u+ktot>29792
vm    4GB used>57820      vm>35232      user>14780      kern>14952      u+ktot>29732
vm    8GB used>82932      vm>36912      user>15700      kern>15388      u+ktot>31088
vm   12GB used>110072     vm>35248      user>14812      kern>15624      u+ktot>30436
vm   15GB used>122012     vm>35424      user>14832      kern>15824      u+ktot>30656

That works out to roughly 8MB allocated for every 1GB of memory. Might be a memory map in the kernel… but I thought that would only grow as memory is allocated rather than being set up at boot.
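For reference, taking the 64MB VM row as a baseline against the 16GB physical row gives a figure in the same ballpark:

$ echo $(( (144508 - 24612) / (16*1024 - 64) ))   # extra "used" kB per MB of RAM => 7, i.e. roughly 7-8MB per GB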

It would be interesting to see, if anyone has access to any bigmem machines, whether the trend continues?

Best Answer

The "memory used by a process" is not a clear cut concept in modern operating systems. What can be measured is the size of the address space of the process (SIZE) and resident set size (RSS, how many of the pages in the address space are currently in memory). Part of RSS is shared (most processes in memory share one copy of glibc, and so for assorted other shared libraries; several processes running the same executable share it, processes forked share read-only data and possibly a chunk of not-yet-modified read-write data with the parent). On the other hand, memory used for the process by the kernel isn't accounted for, like page tables, kernel buffers, and kernel stack. In the overall picture you have to account for the memory reserved for the graphics card, the kernel's use, and assorted "holes" reserved for DOS and other prehistoric systems (that isn't much, anyway).

The only way of getting an overall picture is what the kernel reports as such. Adding up numbers with unknown overlaps and unknown omissions is a nice exercise in arithmetic, nothing more.
