Awk – Group by and sum column values

awk pipe ps

I have a command that lists system processes by memory usage:

ps -A --sort -rss -o comm,pmem

It prints a table like this:

COMMAND         %MEM
firefox         28.2
chrome           5.4
compiz           4.8
atom             2.5
chrome           2.3
Xorg             2.3
skype            2.2
chrome           2.0
chrome           1.9
atom             1.9
nautilus         1.8
hud-service      1.5
evince           1.3

I would like to get the total memory share per program, instead of one row per process of the same program, so that the output looks like this:

COMMAND         %MEM
firefox         28.2
chrome          11.6
compiz           4.8
atom             4.4
Xorg             2.3
skype            2.2
nautilus         1.8
hud-service      1.5
evince           1.3

I thought about using awk, which I don't know well, and ended up with something like:

ps -A --sort -rss -o comm,pmem | awk -F "\t" '
{processes[$0] += $1;}
{End
for(i in processes) {
  print i,"\t",processes[i];
}
}'

But it didn't work.

How can I correct this?

Best Answer

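processes[$0] += $1; uses the whole line as the key of the associative array, so two rows for the same program with different %MEM values become different keys and are never merged. It also adds $1, the command name, instead of the %MEM value. Use $1 (the command name) as the key and add $2 to it.

Two more things break the attempt. ps separates the columns with spaces, not tabs, so with -F "\t" the whole line ends up in $1. And the final block has to be introduced by the END pattern, written in upper case and placed outside the braces; as written, {End ...} is an ordinary action that runs the loop for every input line.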
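To make the point about keys concrete, here is a minimal sketch that counts how many distinct keys each choice produces. The three sample rows and the helper names rows and count_keys are made up for illustration; they are not part of the original command.

```shell
# Three made-up rows in the same "name value" shape as the ps output.
rows() {
  printf '%s\n' 'chrome 5.4' 'chrome 2.3' 'atom 2.5'
}

# Hypothetical helper: count the distinct keys when indexing by the
# whole line ($0) versus by the command name ($1).
count_keys() {
  awk '
    { by_line[$0]; by_name[$1] }   # referencing an element creates it
    END {
      for (k in by_line) lines++
      for (k in by_name) names++
      print lines, names
    }
  '
}

rows | count_keys   # prints: 3 2
```

Keyed by $0, the two chrome rows stay separate (3 keys); keyed by $1, they collapse into one entry (2 keys), which is what grouping needs.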

Try:

$ ps -A --sort -rss -o comm,pmem | awk '
  NR == 1 { print; next }
  { a[$1] += $2 }
  END {
    for (i in a) {
      printf "%-15s\t%s\n", i, a[i];
    }
  }
'

If you want to sort the output by the second field, try:

$ ps -A --sort -rss -o comm,pmem | awk '
  NR == 1 { print; next }
  { a[$1] += $2 }
  END {
    for (i in a) {
      printf "%-15s\t%s\n", i, a[i] | "sort -rnk2";
    }
  }
'
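
To try this without depending on what is currently running, here is a sketch that feeds the sample table from the question through the same grouping logic. It is a variant, not the command above: the sorting happens in the shell after awk rather than through a pipe opened inside awk, so the header line is read off first and only the data rows are sorted. The helper names sample_rows and sum_by_command, the "%-15s %4.1f" format, and the LC_ALL=C setting are my own assumptions.

```shell
# Sample rows copied from the question.
sample_rows() {
  cat <<'EOF'
COMMAND         %MEM
firefox         28.2
chrome           5.4
compiz           4.8
atom             2.5
chrome           2.3
Xorg             2.3
skype            2.2
chrome           2.0
chrome           1.9
atom             1.9
nautilus         1.8
hud-service      1.5
evince           1.3
EOF
}

# Hypothetical helper: sum column 2 per name in column 1, largest first.
sum_by_command() {
  awk '
    NR == 1 { print; next }        # pass the header through unchanged
    { total[$1] += $2 }            # accumulate %MEM per command name
    END {
      for (name in total)          # iteration order is unspecified
        printf "%-15s %4.1f\n", name, total[name]
    }
  ' | {
    IFS= read -r header            # take the header off the stream
    printf '%s\n' "$header"
    LC_ALL=C sort -k2,2nr          # numeric, descending, on field 2 only
  }
}

sample_rows | sum_by_command
```

With the sample data this should give firefox 28.2, chrome 11.6, compiz 4.8, atom 4.4, and then the remaining single-process programs, which is the table asked for in the question. On a real system, replace sample_rows with ps -A --sort -rss -o comm,pmem.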