Resident Set Size vs Virtual Size – Explanation of Memory Metrics

linuxmemoryprocess

I found that pidstat would be a good tool to monitor processes. I want to calculate the average memory usage of a particular process. Here is some example output:

02:34:36 PM       PID  minflt/s  majflt/s     VSZ    RSS   %MEM  Command
02:34:37 PM      7276      2.00      0.00  349212 210176   7.14  scalpel

(This is part of the output from pidstat -r -p 7276.)

Should I use the Resident Set Size (RSS) or Virtual Size (VSZ) information to calculate the average memory consumption? I have read a few thing on Wikipedia and on forums but I am not sure to fully understand the differences. Plus, it seems that none of them are reliable. So, how can I monitor a process to get its memory usage?

Any help on this matter would be useful.

Best Answer

RSS is how much memory this process currently has in main memory (RAM). VSZ is how much virtual memory the process has in total. This includes all types of memory, both in RAM and swapped out. These numbers can get skewed because they also include shared libraries and other types of memory. You can have five hundred instances of bash running, and the total size of their memory footprint won't be the sum of their RSS or VSZ values.

If you need to get a more detailed idea about the memory footprint of a process, you have some options. You can go through /proc/$PID/map and weed out the stuff you don't like. If it's shared libraries, the calculation could get complex depending on your needs (which I think I remember).

If you only care about the heap size of the process, you can always just parse the [heap] entry in the map file. The size the kernel has allocated for the process heap may or may not reflect the exact number of bytes the process has asked to be allocated. There are minute details, kernel internals and optimisations which can throw this off. In an ideal world, it'll be as much as your process needs, rounded up to the nearest multiple of the system page size (getconf PAGESIZE will tell you what it is — on PCs, it's probably 4,096 bytes).

If you want to see how much memory a process has allocated, one of the best ways is to forgo the kernel-side metrics. Instead, you instrument the C library's heap memory (de)allocation functions with the LD_PRELOAD mechanism. Personally, I slightly abuse valgrind to get information about this sort of thing. (Note that applying the instrumentation will require restarting the process.)

Please note, since you may also be benchmarking runtimes, that valgrind will make your programs very slightly slower (but probably within your tolerances).