Load Average in Top Output – Interpretation Across Distributions

load-average rhel top

I would like to know whether the output of top on a Red Hat-based Linux could be interpreted differently from the output on a Debian-based Linux.

To make the question more specific: I want to understand how the "load average" on the first line of the top command's output on a Red Hat system should be interpreted, and how to verify this from official documentation or code.

[There are many ways to approach this subject, all of which are acceptable answers to the question]

One potential approach would be to find where this information is officially documented.
Another would be to find the version of the code that top is built from in the specific distribution and version I am working on.

The command output I am getting is:

    top - 13:08:34 up  1:19,  2 users,  load average: 0.02, 0.00, 0.00
    Tasks: 183 total,   1 running, 182 sleeping,   0 stopped,   0 zombie
    Cpu(s):  0.2%us,  0.2%sy,  0.0%ni, 96.8%id,  2.7%wa,  0.0%hi,  0.0%si,  0.0%st
    Mem:   3922520k total,   788956k used,  3133564k free,   120720k buffers
    Swap:  2097148k total,        0k used,  2097148k free,   344216k cached

In this case how can I interpret the load average value?

From one documentation source I learned that the load average covers the last minute, and from another that the value should be multiplied by 100 before being interpreted.
So, the question is:

Is it 0.02% or 2% loaded?

Documentation sources and versions:

  1. The first one starts with

     TOP(1)                        Linux User’s Manual                       TOP(1)
    
     NAME
            top - display Linux tasks
    

    Source: man top in my RedHat distribution
    Ubuntu also has the version with "tasks" that does not explain the load average in:
    http://manpages.ubuntu.com/manpages/precise/man1/top.1.html

  2. The second one starts with

     TOP(1)                          User Commands                         TOP(1)
    
    NAME         top
    
    top - display Linux processes
    

    Source:
    http://man7.org/linux/man-pages/man1/top.1.htm

  3. This one starts with:

    TOP(1)
    
    NAME
    
    top - display and update information about the top cpu processes
    

    Source: http://www.unixtop.org/man.shtml

The first one can be seen with man top on RHEL or in the online Ubuntu documentation. It does not explain the output format, nor the load average I am interested in.

The second one contains a brief explanation, pointing out that the load averages cover the last 1, 5 and 15 minutes, but says nothing about how to interpret their values!

I quote directly from the second source:

    2a. UPTIME and LOAD Averages
    This portion consists of a single line containing:
        program or window name, depending on display mode
        current time and length of time since last boot
        total number of users
        system load avg over the last 1, 5 and 15 minutes

So, if this explanation is indeed correct, it is enough to establish that the first load average covers the last 1 minute.
But it does not explain the format of the number.

In the third explanation, it says that:

When specifying numbers for load averages, they should be multiplied by 100.

This explanation suggests that 0.02 means 2% and not 0.02%. But is this correct? And is it correct for all Linux distributions and potentially different implementations of top?
To answer this, I tried to go through the code by searching for it online. But I found at least two different versions of top related to RHEL: builtin-top.c and the refactored top.c. Both carry a Red Hat copyright notice at the beginning of the code, so it seems logical that RHEL uses one of these.
http://lxr.free-electrons.com/source/tools/perf/builtin-top.c
http://lxr.free-electrons.com/source/tools/perf/util/top.c

So, before delving into that much code, I wanted an opinion on where to focus in order to form an accurate understanding of how the CPU load is interpreted.

From information given in the answers below, in addition to some personal search, I have found that:

  1. The top that I am using comes from the package procps-3.2.8, which can be verified with top -v.
  2. In the version of procps-3.2.8 that I downloaded from the official website, the uptime tool gets its information directly from the procfs file /proc/loadavg (without using the C library function getloadavg()).
  3. The top command does not use getloadavg() either. I verified that top does the same thing as uptime to show the load averages: it actually calls the uptime tool's function, which gets its information from the procfs file /proc/loadavg.

    So, everything points to the /proc/loadavg file! Thus, to form an accurate understanding of the load average shown by top, one must read the kernel code to see how the loadavg file is written.

There is also an excellent article pointed out in one of the answers that provides a layman's terms explanation of the three values of loadavg.
So, despite the fact that all answers have been equally useful and helpful, I am going to mark the one that pointed to the article
http://www.linuxjournal.com//article/9001 as "the" answer to my question. Thank you all for your contribution!

Additionally, from the question "Understanding top and load average", I found a link to the kernel source code that points to the spot where loadavg is calculated. There is a huge comment there explaining how it works, and this part of the code is in C!
The link to the code is http://lxr.free-electrons.com/source/kernel/sched/loadavg.c
Again I am not trying to engage in any form of plagiarism, I am just adding this for completeness. So, I am repeating that the link to the kernel code was found from one of the answers in Understanding top and load average.
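As a rough illustration of what that kernel code does, here is a floating-point sketch of the idea: roughly every 5 seconds the kernel folds the current number of active tasks into an exponentially damped moving average, with a different time window for each of the three values. The names update_load and simulate are made up for this example, and the kernel itself works in fixed-point arithmetic, so this is an approximation under those assumptions and not the actual implementation.

```python
import math

SAMPLE_INTERVAL = 5.0  # seconds between samples (assumed: roughly every 5 s)


def update_load(load, active_tasks, window_seconds):
    """Fold one sample into an exponentially damped moving average.

    Floating-point approximation; the kernel uses fixed-point math.
    """
    decay = math.exp(-SAMPLE_INTERVAL / window_seconds)
    return load * decay + active_tasks * (1.0 - decay)


def simulate(samples, window_seconds=60.0, start=0.0):
    """Return the load average after feeding in a sequence of task counts."""
    load = start
    for active_tasks in samples:
        load = update_load(load, active_tasks, window_seconds)
    return load


if __name__ == "__main__":
    # One task runnable for a full minute (12 samples), starting from idle.
    print(round(simulate([1] * 12), 2))
```

With these assumptions, one continuously runnable task on a previously idle machine brings the 1-minute value to about 0.63 after one minute rather than 1.00, because older samples decay gradually instead of dropping out of a fixed window.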

Best Answer

The CPU load is the length of the run queue, i.e. the length of the queue of processes waiting to be run.

The uptime command may be used to see the average length of the run queue over the last minute, the last five minutes, and the last 15 minutes, just like what's usually displayed by top.

A high load value means the run queue is long. A low value means that it is short. So, if the one-minute load average is 0.05, it means that, averaged over that minute, there were 0.05 processes waiting to run in the run queue. It is not a percentage. This is, AFAIK, the same on all Unices (although some Unices may not count processes waiting for I/O, which I think Linux does; OpenBSD, for a while only, also counted kernel threads, so that the load was always 1 or more).
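To make the 0.05 figure concrete, here is a small sketch that averages sampled queue lengths. The sampling scheme (once per second for a minute) and the function name average_queue_length are assumptions for illustration only; the kernel uses a damped average rather than this plain mean, but the unit of the result is the same: processes, not percent.

```python
def average_queue_length(samples):
    """Plain arithmetic mean of sampled run-queue lengths."""
    return sum(samples) / len(samples)


if __name__ == "__main__":
    # Hypothetical minute sampled once per second: a single process was
    # waiting during 3 of the 60 samples, and the queue was empty otherwise.
    samples = [1] * 3 + [0] * 57
    print(average_queue_length(samples))
```

Three samples with one waiting process out of sixty give a mean of 0.05 processes, which is how a fractional load value can arise from whole processes.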

The Linux top utility gets the load values from the kernel, which writes them to /proc/loadavg. Looking at the sources for procps-3.2.8, we see that:

  1. To display the load averages, the sprint_uptime() function is called in top.c.
  2. This function lives in proc/whattime.c and calls loadavg() in proc/sysinfo.c.
  3. That function simply opens LOADAVG_FILE to read the load averages.
  4. LOADAVG_FILE is defined earlier as "/proc/loadavg".
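The same chain can be sketched outside of procps. The Python below is not the procps code; it only mirrors the steps above under the assumption that the first three whitespace-separated fields of /proc/loadavg are the 1, 5 and 15 minute averages. The helper names (parse_loadavg, read_loadavg, format_like_top) are made up for this example, and the format string simply imitates the top line shown in the question.

```python
from typing import NamedTuple


class LoadAvg(NamedTuple):
    one: float
    five: float
    fifteen: float


def parse_loadavg(text):
    """Parse the first three fields of /proc/loadavg content."""
    fields = text.split()
    if len(fields) < 3:
        raise ValueError("unexpected /proc/loadavg content: %r" % text)
    return LoadAvg(*(float(f) for f in fields[:3]))


def read_loadavg(path="/proc/loadavg"):
    """Read and parse the load averages from procfs (Linux only)."""
    with open(path) as f:
        return parse_loadavg(f.read())


def format_like_top(avg):
    """Render the values the way the first line of top shows them."""
    return "load average: %.2f, %.2f, %.2f" % avg


if __name__ == "__main__":
    print(format_like_top(read_loadavg()))
```

Note that the values are passed through unchanged: nothing in this path multiplies them by 100 or turns them into a percentage.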