Does /proc/[PID]/stat display cumulative CPU stats about child processes?

proc

I'm not sure how to phrase this question correctly. I've tried reading man proc, but I couldn't find a clear answer or a simple way to verify it.
I'm trying to get the CPU/RAM consumption for one PID, but I don't know how many child processes the program will spawn, and I want the total CPU and RAM consumption, not just that of the main process.
I know from experience that /proc/[PID]/io is indeed cumulative across all child processes, but I would like to know, with proof if possible, whether the same applies to /proc/[PID]/stat.

Best Answer

Conclusion:
If you don't want to read the whole explanation, just read this:
Yes, the values in /proc/[PID]/stat let you determine the amount of CPU time used by a process and its children.
However, you can't use it for real-time monitoring, because the children's CPU time is only updated when a child process dies and has been waited for.

Explanation:
According to man time, the time command reports the following statistics:

These statistics consist of (i) the elapsed real time between invocation and termination, (ii) the user CPU time (the sum of the tms_utime and tms_cutime values in a struct tms as returned by times(2)), and (iii) the system CPU time (the sum of the tms_stime and tms_cstime values in a struct tms as returned by times(2)).

If you read man times, you will see that the structure is defined as:

struct tms {
   clock_t tms_utime;  /* user time */
   clock_t tms_stime;  /* system time */
   clock_t tms_cutime; /* user time of children */
   clock_t tms_cstime; /* system time of children */
};

This means that the time command returns the cumulative user and system CPU time of the process and all its children.
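
To make the role of the children fields concrete, here is a minimal Python sketch of what a time-like wrapper could do with times(2). It relies on os.times(), which exposes the same four CPU counters plus elapsed time. The helper name timed_run and the busy-loop command are made up for illustration, and the sketch only reports the children deltas, not the wrapper's own CPU time.

```python
import os
import subprocess
import sys


def timed_run(argv):
    """Run argv, wait for it, and report the CPU time charged to children.

    Hypothetical helper for illustration; not how time(1) is implemented.
    """
    before = os.times()
    subprocess.run(argv, check=True)  # run() waits for the child to exit
    after = os.times()
    return {
        "real": after.elapsed - before.elapsed,
        # tms_cutime / tms_cstime: only waited-for children are counted
        "child_user": after.children_user - before.children_user,
        "child_sys": after.children_system - before.children_system,
    }


if __name__ == "__main__":
    burn = "sum(i * i for i in range(2_000_000))"
    print(timed_run([sys.executable, "-c", burn]))
```

The children counters only move because subprocess.run() waits for the child before returning.
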
Now we need to know what we can extract from /proc. The /proc/[PID]/stat section of man proc documents the following fields:

(14) utime %lu
Amount of time that this process has been scheduled in user mode, measured in clock ticks (divide by sysconf(_SC_CLK_TCK)). This includes guest time, guest_time (time spent running a virtual CPU, see below), so that applications that are not aware of the guest time field do not lose that time from their calculations.
(15) stime %lu
Amount of time that this process has been scheduled in kernel mode, measured in clock ticks (divide by sysconf(_SC_CLK_TCK)).
(16) cutime %ld
Amount of time that this process's waited-for children have been scheduled in user mode, measured in clock ticks (divide by sysconf(_SC_CLK_TCK)). (See also times(2).) This includes guest time, cguest_time (time spent running a virtual CPU, see below).
(17) cstime %ld
Amount of time that this process's waited-for children have been scheduled in kernel mode, measured in clock ticks (divide by sysconf(_SC_CLK_TCK)).

So basically, the /proc/[PID]/stat file contains the same counters that time uses to determine CPU time, expressed in clock ticks rather than seconds.
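
As a rough sketch of how these four fields could be read programmatically, the Python below parses a stat line and converts ticks to seconds. The function names parse_cpu_fields and read_cpu_seconds are invented for this example. Splitting after the last ")" is a precaution, since the comm field (2) may itself contain spaces.

```python
import os


def parse_cpu_fields(stat_line):
    """Return (utime, stime, cutime, cstime) in clock ticks from one stat line."""
    # Field 2 (comm) is wrapped in parentheses and may contain spaces or ')',
    # so split after the LAST ')' instead of splitting the whole line on spaces.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so field N sits at index N - 3.
    return tuple(int(rest[n - 3]) for n in (14, 15, 16, 17))


def read_cpu_seconds(pid):
    """Read /proc/<pid>/stat and return the four counters in seconds (Linux only)."""
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    with open(f"/proc/{pid}/stat") as f:
        fields = parse_cpu_fields(f.read())
    return tuple(ticks / ticks_per_second for ticks in fields)
```

For a simple process name, cut -d ' ' -f 14-17 gives the same four columns; the parser only matters when the name contains spaces.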

Armed with that knowledge, I ran my script with time load.sh and added cat /proc/$$/stat at the end of the script. Here are the results:

9398 (load.sh) S 5379 9398 5379 34817 9398 4194304 1325449 7562836 0 0 192 520 3964 1165 20 0 1 0 814903 14422016 1154 18446744073709551615 4194304 5242124 140726473818336 0 0 0 65536 4 65538 1 0 0 17 3 0 0 818155 0 0 7341384 7388228 9928704 140726473827029 140726473827049 140726473827049 140726473830382 0  

output of the time command:

real    0m38,783s
user    0m41,576s
sys     0m16,866s

According to man proc, we need to look at columns 14, 15, 16 and 17: 192 520 3964 1165. The values are in clock ticks (100 per second here, so one tick is a centisecond). Summing the time spent in user and system mode by the process and its children gives:

192+3964 = 4156  <=>  user 0m41,576s
520+1165 = 1685  <=>  sys  0m16,866s
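
The same arithmetic as a small Python sketch. It assumes 100 clock ticks per second, which matches the numbers above; on a real system the value should come from sysconf(_SC_CLK_TCK). The function name cpu_seconds is just for illustration.

```python
def cpu_seconds(utime, stime, cutime, cstime, ticks_per_second=100):
    """Sum own and waited-for children's ticks, then convert to seconds.

    ticks_per_second=100 is an assumption; query sysconf(_SC_CLK_TCK) in practice.
    """
    user = (utime + cutime) / ticks_per_second
    system = (stime + cstime) / ticks_per_second
    return user, system


user, system = cpu_seconds(192, 520, 3964, 1165)
print(f"user {user:.2f}s  sys {system:.2f}s")  # user 41.56s  sys 16.85s
```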

Et voilà: the CPU time is not strictly cumulative, but you can calculate fairly accurately (to the centisecond) the CPU time used by your program and its children using /proc/[PID]/stat.

EDIT:
After further testing and talking with people, I finally got an answer. I ran a script that simply contains:

#!/bin/bash
sleep 5
time stress --cpu 4 -t 60s --vm-hang 15
sleep 5
cat /proc/$$/stat | cut -d ' ' -f 14-17
exit

I used watch to monitor the values in /proc/$$/stat at the same time. As long as the child process has not finished, the counters are not updated. When stress ends, the values displayed in /proc/$$/stat are updated, and the time command and columns 14 to 17 of /proc end up with similar results.
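
The same behaviour can be reproduced without stress. The Python sketch below is Linux-only and samples its own cutime + cstime before starting a CPU-bound child, while the child is running, and after waiting for it. The names children_ticks and observe_child are invented for this example, and the busy loop stands in for stress.

```python
import subprocess
import sys
import time


def children_ticks(pid="self"):
    """Return cutime + cstime (fields 16 and 17) in clock ticks."""
    with open(f"/proc/{pid}/stat") as f:
        rest = f.read().rsplit(")", 1)[1].split()
    return int(rest[13]) + int(rest[14])  # field N is rest[N - 3]


def observe_child(cpu_seconds=0.5):
    """Sample the children counters before, during and after a busy child."""
    burn = (
        "import time\n"
        f"end = time.process_time() + {cpu_seconds}\n"
        "while time.process_time() < end:\n"
        "    pass\n"
    )
    before = children_ticks()
    child = subprocess.Popen([sys.executable, "-c", burn])
    time.sleep(cpu_seconds / 2)  # child is busy but has not been waited for
    during = children_ticks()
    child.wait()                 # reaping the child adds its CPU time to ours
    after = children_ticks()
    return before, during, after


if __name__ == "__main__":
    print(observe_child())
```

The middle value should equal the first one, and only the last one should include the child's CPU time, which is why these fields are unsuitable for live monitoring.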

Old edit: I thought it was over, but after doing some more research I tried the same thing with the stress command:

time stress --cpu 4 -t 60s  
stress: info: [18598] dispatching hogs: 4 cpu, 0 io, 0 vm, 0 hdd
stress: info: [18598] successful run completed in 60s
real    1m0,003s
user    3m53,663s
sys     0m0,349s

During the execution I checked the output of this command twice per second:

cat /proc/11223/stat | cut -d ' ' -f 14-17
0 0 0 0

Meanwhile, ps faux | grep stress showed this particular PID as the parent of the four stress worker processes.
