How to print incremental count of occurrences of unique values in column 1

awkperl

I'm trying to come up with an solution to this problem, I need to incrementally count and then print the counts of the unique values in column 1 of a tab delimited text file. Here is an example:

Apple_1   1      300
Apple_2   1      500
Apple_2   500    1500
Apple_2   1500   2450
Apple_3   1      1250
Apple_3   1250   2000

And the desired output is:

Apple_1   1      300     1
Apple_2   1      500     1
Apple_2   500    1500    2
Apple_2   1500   2450    3
Apple_3   1      1250    1
Apple_3   1250   2000    2

I know that I can print the line number in awk with just print NR, but I don't know how to reset it for each unique value of column 1.

Thanks for any help you can offer, I appreciate it.

Best Answer

The standard trick for this kind of problem in Awk is to use an associative counter array:

awk '{ print $0 "\t" ++count[$1] }'

This counts the number of times the first word in each line has been seen. It's not quite what you're asking for, since

Apple_1   1      300
Apple_2   1      500
Apple_1   500    1500

would produce

Apple_1   1      300     1
Apple_2   1      500     1
Apple_1   500    1500    2

(the count for Apple_1 isn't reset when we see Apple_2), but if the input is sorted you'll be OK.

Otherwise you'd need to track a counter and last-seen key:

awk '{ if (word == $1) { counter++ } else { counter = 1; word = $1 }; print $0 "\t" counter }'
Related Question