Linux – Discrepancy between reported used memory and sum of application memory usage

linux · memory · memory-leaks · process · virtual-memory

I'm running a desktop system that quite regularly suffers from a lack of memory, which prompted me to investigate what causes the issue in the first place.

The problem is that no single process eats the memory, yet the system doesn't show it as available. What's more, the system does swap, so the memory pressure looks real. What's puzzling is that usage returns to normal (~1 GB used) after I log out and back in, so it looks like some odd interaction between userland and the kernel rather than a memory leak.

In short:

  • memory reported as used by free, excluding cache/buffers: 3173960 kB
  • sum of USS of all applications: 2413952 kB
  • SLAB size: 158968 kB
  • zram (after compression): 75992 kB

That gives 3173960 - 2413952 - 158968 - 75992 = 525048 kB of unaccounted memory usage.
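The arithmetic above can be reproduced directly; the figures are the ones quoted in the question:

```shell
# Accounting from the question, all values in kB.
used_minus_cache=3173960   # free -k, "-/+ buffers/cache" used column
uss_total=2413952          # smem -t, USS total
slab=158968                # /proc/meminfo Slab
zram_compr=75992           # sum of /sys/block/zram*/compr_data_size
echo "$(( used_minus_cache - uss_total - slab - zram_compr )) kB unaccounted"
```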

What am I missing or not counting?


Sum of applications memory usage:

# smem -t | sed -n '1p;$p'
  PID User     Command                         Swap      USS      PSS      RSS 
  108 6                                      244524  2413952  2461340  2648488

Memory usage as reported by free:

# free -k
             total       used       free     shared    buffers     cached
Mem:       4051956    3449748     602208          0      26548     249240
-/+ buffers/cache:    3173960     877996
Swap:      4051952     242592    3809360

General memory statistic:

# cat /proc/meminfo 
MemTotal:        4051956 kB
MemFree:          612260 kB
Buffers:           26636 kB
Cached:           249304 kB
SwapCached:       107892 kB
Active:          1774004 kB
Inactive:         885268 kB
Active(anon):    1712484 kB
Inactive(anon):   710788 kB
Active(file):      61520 kB
Inactive(file):   174480 kB
Unevictable:        9332 kB
Mlocked:            9332 kB
SwapTotal:       4051952 kB
SwapFree:        3809368 kB
Dirty:                40 kB
Writeback:             0 kB
AnonPages:       2343292 kB
Mapped:            95288 kB
Shmem:             36396 kB
Slab:             158968 kB
SReclaimable:      53900 kB
SUnreclaim:       105068 kB
KernelStack:        3528 kB
PageTables:        43600 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     6077928 kB
Committed_AS:    4013288 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      139852 kB
VmallocChunk:   34359570976 kB
HardwareCorrupted:     0 kB
AnonHugePages:    641024 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:     2310848 kB
DirectMap2M:     1882112 kB
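Several of these lines are RAM consumers that no process's USS accounts for. A quick sketch summing a few of them, with the values pasted from the output above (on a live system you would read /proc/meminfo directly):

```shell
# SwapCached, PageTables and KernelStack are kernel-side RAM users that
# never appear in smem's per-process USS. Values pasted from above.
awk '{ sum += $2; print $1, $2, "kB" } END { print "sum:", sum, "kB" }' <<'EOF'
SwapCached: 107892
PageTables: 43600
KernelStack: 3528
EOF
```

That accounts for part of the gap, though not all of it.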

Swaps are on zram:

# cat /proc/swaps 
Filename                                Type            Size    Used    Priority
/dev/zram0                              partition       2025976 121252  100
/dev/zram1                              partition       2025976 121324  100

# awk ' { print $0 / 1024; sum+=$0 } END { print "sum:" sum/1024 } ' /sys/block/zram*/compr_data_size
37962.4
38030.1
sum:75992.5
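Note that compr_data_size is only the compressed payload; the allocator adds overhead, so zram's real RAM footprint is larger. On kernels that expose it (the sysfs layout varies between versions), mem_used_total reports the actual usage. A sketch of the same calculation, with made-up illustrative byte counts standing in for /sys/block/zram*/mem_used_total:

```shell
# Same awk as above, but aimed at mem_used_total rather than
# compr_data_size. The two sample values here are illustrative only,
# not from the question's system.
awk '{ print $0 / 1024; sum += $0 } END { print "sum:" sum / 1024 }' <<'EOF'
42991616
43120640
EOF
```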

Best Answer

The problem

You have 4 GB of RAM (physical memory) and two zram devices of 2,025,976 kB (roughly 2 GB) each. zram stores its compressed data in that same RAM. I don't know the internals exactly, but whatever the mechanism, I can clearly imagine a scenario where Linux pages out (= moves some memory from RAM to zram) to get more free space, but the zram usage in memory then grows, so it pages out further, which increases zram usage again, and so on until zram is consuming all your physical memory.

I guess there is a threshold on any system below which paging out won't stress the kernel to the point described above, so that zram improves performance.
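A toy model of that effect, assuming (hypothetically) that swapped pages compress to 40% of their original size: to genuinely free N kB, the kernel must page out N / (1 − 0.4) kB, and the compressed remainder stays resident in zram:

```shell
# Toy feedback model: netting `target` kB of free RAM when swapped
# pages keep `ratio_pct`% of their size resident in zram. Both numbers
# are assumptions for illustration.
target=102400        # want 100 MB genuinely freed
ratio_pct=40         # assumed compression: pages shrink to 40%
paged_out=$(( target * 100 / (100 - ratio_pct) ))
in_zram=$(( paged_out * ratio_pct / 100 ))
echo "paged out: $paged_out kB, held in zram: $in_zram kB"
```

The worse the data compresses, the more must be paged out per kB actually freed.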

Insights

When your system wants to swap out 100 MB, it puts those 100 MB in zram. Let's say the data compresses by 50%, to 50 MB. Your system wanted to free 100 MB, but only 50 MB actually got freed.

Linux is also clever in that when it has paged out chunks of memory (put them in swap) but needs them again, it can do an optimisation: it pages the memory back in but keeps the copy in swap as well, so that if it soon needs to page those parts out again it can avoid an expensive write to the swap device (this shows up as SwapCached in /proc/meminfo). So in your case, Linux may keep the 100 MB in zram while also putting them back in normal RAM, and the system consumes 150 MB for a while.

If this is repeated for bigger programs with less compressible data, it can quickly become a nightmare. Imagine a 300 MB chunk of RAM that gets paged out and occupies 120 MB in each zram swap: Linux wanted to free 300 MB for other purposes but has only freed 300 - 120 - 120 = 60 MB. It may then try to page out further pages, and so on, with the problem that you have two zram devices that can each use up to 2 GB of RAM, thus eating all your memory.
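The arithmetic from the scenario above, as a tiny helper (MB figures converted to kB):

```shell
# Net RAM freed by a page-out to zram = amount paged out minus the
# compressed copy that stays resident in RAM.
net_freed() { echo "$(( $1 - $2 )) kB freed"; }   # args in kB
net_freed 102400 51200     # 100 MB compressed to 50 MB
net_freed 307200 245760    # 300 MB leaving 2 x 120 MB in zram
```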

Conclusion and solution

So is zram crap? No, not at all. The problem is that you configured zram to a total size equal to your physical RAM, and that's the problem. IMHO you should not configure zram to use more than 25% of your physical RAM, which means you would still have to rely on a hard-disk swap once the zram swap fills up.

A simple solution would be to reduce both zram devices to 500 MB each and add a swap file of roughly 2-3 GB, allowing the kernel to move genuinely unused pages from zram to the swap file. The swap file won't use RAM and will diminish the pressure on it.
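A hedged sketch of that reconfiguration (device names, sizes, and the swap-file path are examples; a zram device must be reset before its disksize can be changed, and all of this requires root):

```shell
# Shrink both zram devices to 500 MB each, then add a 2 GB disk-backed
# swap file at lower priority so cold pages can leave RAM entirely.
swapoff /dev/zram0 /dev/zram1
echo 1 > /sys/block/zram0/reset
echo 1 > /sys/block/zram1/reset
echo 500M > /sys/block/zram0/disksize
echo 500M > /sys/block/zram1/disksize
mkswap /dev/zram0 && swapon -p 100 /dev/zram0
mkswap /dev/zram1 && swapon -p 100 /dev/zram1
dd if=/dev/zero of=/swapfile bs=1M count=2048   # 2 GB swap file
chmod 600 /swapfile
mkswap /swapfile && swapon -p 10 /swapfile      # lower priority than zram
```

With the priorities set this way, the kernel fills zram first and only spills to disk when zram is full.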

Some information on how to set your zram disk size.