Is the sum of all PIDs' “utime” the total system utime?

cpu proc process

In order to measure a user's total CPU time, I'm using the "utime" field out of /proc/[pid]/stat:

utime %lu   Amount of time that this process has been scheduled in user
            mode, measured in clock ticks (divide by
            sysconf(_SC_CLK_TCK)).  This includes guest time, guest_time
            (time spent running a virtual CPU, see below), so that
            applications that are not aware of the guest time field do
            not lose that time from their calculations.

(from the proc(5) man page)

So, my "user utime" is the sum of the utime of all PIDs this user is running.

I'm hoping this will give me an accurate value for the number of CPU seconds this user has spent. Am I on the right track?

Some of the things I don't understand or take into account yet:

  • Each PID also has a parent PID (or zero). But I'm counting every PID, not just the ones with a ppid of 0. Is this correct?
  • In addition to utime, there are stime, cutime and cstime. Do I need to worry about those? I'm assuming that utime is the total CPU time for a PID, not counting the parent.
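For reference, the four fields mentioned above can be pulled out of a single stat line roughly like this. This is only a sketch: the helper names parse_stat_times and ticks_to_seconds are made up for illustration, and the field positions (14 to 17) are taken from proc(5).

```python
import os

def parse_stat_times(stat_line):
    """Return (utime, stime, cutime, cstime) in clock ticks from one stat line."""
    # Field 2 (comm) is wrapped in parentheses and may itself contain spaces
    # or ')' characters, so everything is split after the LAST ')'.
    rest = stat_line[stat_line.rindex(")") + 2:].split()
    # rest[0] is field 3 (state), so fields 14..17 are rest[11:15].
    utime, stime, cutime, cstime = (int(x) for x in rest[11:15])
    return utime, stime, cutime, cstime

def ticks_to_seconds(ticks, clk_tck=None):
    """Convert clock ticks to seconds using sysconf(_SC_CLK_TCK)."""
    if clk_tck is None:
        clk_tck = os.sysconf("SC_CLK_TCK")
    return ticks / clk_tck

# Made-up sample line; a real one comes from /proc/[pid]/stat.
sample = "1234 (my (odd) proc) S 1 1234 1234 0 -1 4194304 100 0 0 0 250 50 10 5 20 0 1 0 12345"
print(parse_stat_times(sample))    # (250, 50, 10, 5)
print(ticks_to_seconds(250, 100))  # 2.5
```

Per proc(5), cutime and cstime cover only children that have been waited for, so they are zero for a process that has not yet reaped any children.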

If I calculate the system's total cpu time using /proc/uptime, this value is fairly close to my sum for all users, but the difference is significant. For instance (in minutes):

system cpu_time:         96.13
sum of users_cputime:   111.45

Correction:

I get "sensible looking" values for all kinds of things. At the moment I'm using the sum of utime, stime, cutime and cstime. It reports values that, while I don't understand them, correlate very well with measurements from the time command.

If I am completely on the wrong track, there's another question:

Best Answer

The traditional way to log and track user CPU time is process accounting. On Linux, install the GNU accounting utilities, typically provided by a package called acct. I'm not sure how accurate it will be at keeping track of the time spent in very short-lived processes, but it'll at least list all the processes ever executed.

Run lastcomm to get a list of all the commands executed by any user and the time spent in each (rounded to ~10ms for short-lived processes, expect to see a lot of 0.00). Run sa to display various sums and statistics. In particular, sa -m displays per-user totals. The statistics accumulated by sa run from the last rotation of the accounting logs (typically located in /var/log/account/).

Note that you aren't going to catch all processes by sampling at intervals, not by a long shot. You'll miss almost all short-lived processes and the last few seconds of long-running ones. Process accounting, by contrast, lists all past processes.

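In /proc/$pid/stat, the user time is the time the process spent executing in user mode (its own computation), as opposed to the system time, which is spent in the kernel on the process's behalf (system calls, including much of the work involved in I/O). Which one to count depends on what you want to do with the information.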

Counting all the PIDs is right. Each process's utime covers only that process, so I don't see what the parent PID has to do with this.
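As a rough sketch of what "counting all the PIDs" looks like in practice, the snippet below walks the numeric directories under /proc and sums utime per user. The function name user_ticks_by_uid and its proc_root parameter are made up for illustration. It also assumes that the owner of /proc/[pid] is the user running the process, which proc(5) says is normally the case but not always. It is still a point-in-time sample, so it only sees processes that are alive when it runs.

```python
import os
from collections import defaultdict

def user_ticks_by_uid(proc_root="/proc"):
    """Sum utime (in clock ticks) of all currently visible PIDs, keyed by uid."""
    totals = defaultdict(int)
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-PID entries such as "self" or "uptime"
        pid_dir = os.path.join(proc_root, entry)
        try:
            # Assumption: the directory owner is the user running the process.
            uid = os.stat(pid_dir).st_uid
            with open(os.path.join(pid_dir, "stat")) as f:
                line = f.read()
        except OSError:
            continue  # the process exited between listdir() and the read
        # Split after the last ')' because the command name may contain spaces.
        fields = line[line.rindex(")") + 2:].split()
        totals[uid] += int(fields[11])  # field 14 of stat is utime
    return dict(totals)

if __name__ == "__main__" and os.path.isdir("/proc"):
    clk_tck = os.sysconf("SC_CLK_TCK")
    for uid, ticks in sorted(user_ticks_by_uid().items()):
        print(uid, ticks / clk_tck, "s")
```

Every PID is counted regardless of its ppid; no filtering on the parent is done or needed for this sum.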

On the system side, your description of /proc/uptime seems wrong. Wikipedia has it right as I write. The first field is the real time elapsed since the system booted, minus any time spent suspended or hibernating. The second field is the cumulative time spent in the idle task, summed over all CPUs. I'm not sure what that really means; it's certainly not the total idle time on my machine. In the kernel, the value is summed in uptime_proc_show from variables updated in account_idle_time.
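To make the two fields concrete, here is a small sketch that parses /proc/uptime and derives a rough non-idle figure. The function names are made up for illustration, and the calculation assumes the second field really is idle time summed over all CPUs, which is the part I'm unsure about above. Treat the result as an estimate, not an exact system CPU time.

```python
import os

def parse_uptime(text):
    """Return (uptime_seconds, idle_seconds) from the contents of /proc/uptime."""
    up, idle = (float(x) for x in text.split()[:2])
    return up, idle

def non_idle_cpu_seconds(up, idle, ncpus):
    """Rough estimate: total CPU capacity since boot minus accumulated idle time.

    Assumes `idle` is summed over all CPUs; the result also includes anything
    the kernel does not book as idle.
    """
    return up * ncpus - idle

if __name__ == "__main__" and os.path.exists("/proc/uptime"):
    with open("/proc/uptime") as f:
        up, idle = parse_uptime(f.read())
    ncpus = os.cpu_count() or 1
    print("non-idle CPU minutes:", non_idle_cpu_seconds(up, idle, ncpus) / 60)
```

On a multi-CPU machine the second field can exceed the first, which is one reason a comparison against the first field alone can look off.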
