VSZ (or VIRT, depending on the version of top) is the amount of memory mapped into the address space of the process. It includes pages backed by the process's executable file and shared libraries, its heap and stack, as well as anything else it has mapped.
In the sample output you show, the virtual size is larger than the amount of physical memory on the system, so necessarily some (most!) of the pages in the process's address space aren't physically present in RAM. That's not a problem: many programs contain large amounts of code and map lots of shared libraries, but they only actually use certain portions of that code, or at least only use certain portions at the same time. This allows the kernel to drop the unused portions from memory whenever they are not in use, or even never to load them in the first place.
Your version of top doesn't seem to show a RES column, which would tell you how much of the memory in the process's address space is currently resident in RAM.
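When top has no RES column, both figures can also be read from /proc/[PID]/status, where VmSize is the virtual size and VmRSS the resident part, both in kB. Here is a minimal Python sketch of that lookup. The helper name parse_vm_sizes and the sample text are made up for illustration and are not taken from your output:

```python
def parse_vm_sizes(status_text):
    """Return {'VmSize': kB, 'VmRSS': kB} from the text of /proc/[PID]/status."""
    sizes = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            # Lines look like "VmRSS:      4616 kB"; keep the number only.
            sizes[key] = int(rest.split()[0])
    return sizes


# Hypothetical excerpt of a status file, for illustration only.
sample = "Name:\tbash\nVmSize:\t   14084 kB\nVmRSS:\t    4616 kB\n"
print(parse_vm_sizes(sample))
```

On a live Linux system you would pass it the contents of /proc/[PID]/status for the process you care about. Kernel threads have no Vm* lines, in which case the result is simply empty.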
Conclusion:
If you don't want to read the whole explanation, just read this:
Yes, the values in /proc/[PID]/stat let you determine the amount of CPU time used by a process and its children.
However, you can't use them for real-time monitoring, because the children's CPU time is updated only when a child process terminates and has been waited for.
Explanation:
According to man time, the time command reports the following statistics:
These statistics consist of (i) the elapsed real time between invocation and termination, (ii) the user CPU time (the sum of the tms_utime and tms_cutime values in a struct tms as returned by times(2)), and (iii) the system CPU time (the sum of the tms_stime and tms_cstime values in a struct tms as returned by times(2)).
Reading man times, one learns that the structure is defined as:
struct tms {
    clock_t tms_utime;  /* user time */
    clock_t tms_stime;  /* system time */
    clock_t tms_cutime; /* user time of children */
    clock_t tms_cstime; /* system time of children */
};
This means that the command returns the cumulative user and system CPU time of the process and all its children.
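To make the two sums concrete, here is a small Python sketch using os.times(), which wraps times(2) and returns the same four counters already converted to seconds. The variable names are mine; this only illustrates the addition, it is not how time itself is implemented:

```python
import os

t = os.times()  # wraps times(2); values are floats in seconds

# Same sums that the time man page describes:
user_total = t.user + t.children_user      # tms_utime + tms_cutime
sys_total = t.system + t.children_system   # tms_stime + tms_cstime

print(f"user {user_total:.2f}s  sys {sys_total:.2f}s")
```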
Now we need to know what we can extract from /proc. In man proc, the section on /proc/[PID]/stat describes the following fields:
(14) utime %lu
Amount of time that this process has been scheduled in user mode, measured in clock ticks (divide by sysconf(_SC_CLK_TCK)). This includes guest time, guest_time (time spent running a virtual CPU, see below), so that applications that are not aware of the guest time field do not lose that time from their calculations.
(15) stime %lu
Amount of time that this process has been scheduled in kernel mode, measured in clock ticks (divide by sysconf(_SC_CLK_TCK)).
(16) cutime %ld
Amount of time that this process's waited-for children have been scheduled in user mode, measured in clock ticks (divide by sysconf(_SC_CLK_TCK)). (See also times(2).) This includes guest time, cguest_time (time spent running a virtual CPU, see below).
(17) cstime %ld
Amount of time that this process's waited-for children have been scheduled in kernel mode, measured in clock ticks (divide by sysconf(_SC_CLK_TCK)).
So basically the /proc/[PID]/stat file contains the values (in clock ticks) that time uses to work out the CPU time in seconds.
Armed with that knowledge, I ran my script with time load.sh, after adding cat /proc/$$/stat at the end of the script.
Here are the results:
9398 (load.sh) S 5379 9398 5379 34817 9398 4194304 1325449 7562836 0 0 192 520 3964 1165 20 0 1 0 814903 14422016 1154 18446744073709551615 4194304 5242124 140726473818336 0 0 0 65536 4 65538 1 0 0 17 3 0 0 818155 0 0 7341384 7388228 9928704 140726473827029 140726473827049 140726473827049 140726473830382 0
Output of the time command:
real 0m38,783s
user 0m41,576s
sys 0m16,866s
According to man proc, we need to look at fields 14, 15, 16 and 17: 192 520 3964 1165
Summing the time spent in user and system mode by the process and its children gives:
192+3964 = 4156 <=> user 0m41,576s
520+1165 = 1685 <=> sys 0m16,866s
Et voilà: the totals do not match to the millisecond, but you can calculate the CPU time used by your program and its children fairly accurately (to about a centisecond, since the values are counted in clock ticks of 1/100 s here) using /proc/[PID]/stat.
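The same extraction can be scripted. Below is a minimal Python sketch, assuming a Linux system. The function name cpu_ticks is hypothetical, and the input is only the first part of the stat line shown above. It splits after the last closing parenthesis because the command name in field 2 may itself contain spaces:

```python
import os


def cpu_ticks(stat_line):
    """Return (utime, stime, cutime, cstime) from a /proc/[PID]/stat line."""
    # Field 2 (the command name) is in parentheses and may contain spaces,
    # so only split what follows the last ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field N lives at rest[N - 3].
    return tuple(int(rest[n - 3]) for n in (14, 15, 16, 17))


# Beginning of the sample line from load.sh.
line = ("9398 (load.sh) S 5379 9398 5379 34817 9398 4194304 "
        "1325449 7562836 0 0 192 520 3964 1165 20 0 1 0 814903")

utime, stime, cutime, cstime = cpu_ticks(line)
ticks_per_second = os.sysconf("SC_CLK_TCK")  # commonly 100 on Linux

print("user", (utime + cutime) / ticks_per_second, "s")
print("sys ", (stime + cstime) / ticks_per_second, "s")
```

Dividing by sysconf(_SC_CLK_TCK) rather than a hard-coded 100 follows what the man page asks for, so the sketch should stay correct on systems with a different tick rate.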
EDIT:
After further testing and talking with people, I finally got an answer. I ran a script that simply contains:
#!/bin/bash
sleep 5
time stress --cpu 4 -t 60s --vm-hang 15
sleep 5
cat /proc/$$/stat | cut -d ' ' -f 14-17
exit
At the same time, I used watch to monitor the values in the script's /proc/$$/stat. As long as the child process has not finished, the counters are not updated. When stress ends, the values displayed in /proc/$$/stat are updated, and the output of the time command and fields 14 to 17 of the stat file end up with similar results.
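The same behaviour can be reproduced without stress. The following Python sketch is only an illustration under my own assumptions (a throwaway child that burns a little CPU, with an arbitrary loop size). It reads the children's user time before and after waiting for the child:

```python
import os
import subprocess
import sys

# Child process that burns some CPU; the loop size is arbitrary.
child = subprocess.Popen(
    [sys.executable, "-c", "sum(i * i for i in range(2_000_000))"]
)

before = os.times().children_user  # child is not yet waited for
child.wait()                       # reaps the child
after = os.times().children_user   # now includes the child's user time

print(f"children_user before wait: {before:.2f}s, after wait: {after:.2f}s")
```

The child's CPU time is only added to the parent's counters once the child has been waited for, which is why the values watched during the run stay flat.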
Old edit
I thought it was over, but after doing some more research I tried the same thing with the stress command:
time stress --cpu 4 -t 60s
stress: info: [18598] dispatching hogs: 4 cpu, 0 io, 0 vm, 0 hdd
stress: info: [18598] successful run completed in 60s
real 1m0,003s
user 3m53,663s
sys 0m0,349s
During the execution, I watched the output of this command twice per second:
cat /proc/11223/stat | cut -d ' ' -f 14-17
0 0 0 0
Meanwhile, ps faux | grep stress showed this particular PID as the parent of the four stress worker processes.
Best Answer
This is down to a Linux:
When a program starts another, it should use the name of the executable file as command line parameter $0, but it may choose to do otherwise. The Name field of /proc/PID/status is always set to the name of the executable by the kernel (but truncated to 15 characters). The application itself can change the name. You can get the longer name from /proc/PID/cmdline (read up to the first null byte).
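As an illustration, here is a minimal Python sketch of both lookups. The helper names and the sample contents are hypothetical. On a real system you would read /proc/PID/status as text and /proc/PID/cmdline as bytes:

```python
def short_name(status_text):
    """Return the Name field from the text of /proc/PID/status."""
    for line in status_text.splitlines():
        if line.startswith("Name:"):
            return line.split(":", 1)[1].strip()
    return None


def first_argument(cmdline_bytes):
    """Return the first NUL-terminated string of /proc/PID/cmdline."""
    return cmdline_bytes.split(b"\0", 1)[0].decode(errors="replace")


# Hypothetical contents, for illustration only.
status = "Name:\tgnome-terminal-\nState:\tS (sleeping)\n"
cmdline = b"/usr/libexec/gnome-terminal-server\0--example\0"

print(short_name(status))        # truncated to 15 characters
print(first_argument(cmdline))   # full first argument
```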