Awk comparison using arrays

awk, text processing

I have the following file:

6180,6180,0,1,,1,0,1,1,0,0,0,0,0,0,0,0,4326,4326,,0.440000,
6553,6553,0,1,,1,0,1,1,0,0,0,0,1,0,1,0,4326,4326,,9.000000,
1297,1297,0,0,,0,0,1,0,0,0,0,0,1,0,1,0,1707,1707,,7.000000,
6598,6598,0,1,,1,0,1,1,0,0,0,1,0,0,0,0,1390,1390,,0.730000,
4673,4673,0,1,,1,0,1,1,0,0,0,0,0,0,0,0,1707,1707,,0.000000,

I need an awk command that prints, for each distinct value of $18, the line with the maximum value of $21.

The desired output looks like:

6553,6553,0,1,,1,0,1,1,0,0,0,0,1,0,1,0,4326,4326,,9.000000,
1297,1297,0,0,,0,0,1,0,0,0,0,0,1,0,1,0,1707,1707,,7.000000,
6598,6598,0,1,,1,0,1,1,0,0,0,1,0,0,0,0,1390,1390,,0.730000,

I got this result by sorting first with the sort command, as below:

sort -t, -k18,18n -k21,21nr | awk -F"," '!a[$18]++'

However, I am looking to do it with a single awk command.

Please advise.

Best Answer

I don't see why you would want to do it in a single awk command; what you have seems perfectly fine. Anyway, here's one way:

$ awk -F, '(max[$18]<$21 || max[$18]==""){max[$18]=$21;line[$18]=$0}
            END{for(key in line){print line[key]}}' file
6598,6598,0,1,,1,0,1,1,0,0,0,1,0,0,0,0,1390,1390,,0.730000,
1297,1297,0,0,,0,0,1,0,0,0,0,0,1,0,1,0,1707,1707,,7.000000,
6553,6553,0,1,,1,0,1,1,0,0,0,0,1,0,1,0,4326,4326,,9.000000,

The idea is simple. We keep two arrays, both keyed by $18: max stores the largest $21 seen so far for that key, and line stores the whole line ($0) on which that value occurred. For every line, if the saved value for $18 is smaller than $21, or if no value has been stored for $18 yet, we update both arrays. Finally, in the END{} block, we print every entry of line.
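
The for (key in line) loop visits keys in an unspecified order, which is why the output above does not follow the input order. If the order of first appearance matters, as in the desired output in the question, one possible variation records each key the first time it is seen. This is only a sketch: the wrapper function max_per_key, the array order and the counter n are names introduced here for illustration, not part of the original answer.

```shell
# Hypothetical wrapper; reads the named files, or stdin if none are given.
max_per_key() {
    awk -F, '
        # First time this key is seen: remember its position in the input.
        !($18 in max) { order[++n] = $18 }
        # New key, or a strictly larger $21 than the stored one: keep this line.
        !($18 in max) || $21 + 0 > max[$18] {
            max[$18] = $21 + 0
            line[$18] = $0
        }
        # Print the winning lines in order of first appearance of each key.
        END { for (i = 1; i <= n; i++) print line[order[i]] }
    ' "$@"
}
```

Called as max_per_key file, this should print one line per distinct $18, in the order the keys first occur. As in the script above, ties keep the earliest line, because only a strictly larger $21 replaces the stored one.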

Note that the script above treats $18 as a string, because awk array subscripts are always strings. Therefore, 001 and 1 will be considered different keys.
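
If values such as 001 and 1 should be grouped together, one option is to force the key to a number before using it as a subscript. The following is a sketch under the assumption that $18 is always numeric; the wrapper function max_per_numeric_key and the variable k are names introduced here for illustration.

```shell
# Hypothetical wrapper; reads the named files, or stdin if none are given.
max_per_numeric_key() {
    awk -F, '
        # Adding 0 converts the field to a number, so "001" and "1"
        # both end up as the subscript "1".
        { k = $18 + 0 }
        # New key, or a strictly larger $21 than the stored one: keep this line.
        !(k in max) || $21 + 0 > max[k] {
            max[k] = $21 + 0
            line[k] = $0
        }
        # Output order is unspecified, as with any for-in loop in awk.
        END { for (k in line) print line[k] }
    ' "$@"
}
```

With non-numeric keys this would collapse every key to 0, so it is only appropriate when the column is known to hold numbers.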
