AWK Text Processing – Count Rows with String Occurrences in Multiple Columns

awk, text processing

I have several hundred text files, each consisting of five tab-delimited columns. The first column contains an index and the following four contain occurrence counts. I would like to count the number of rows in which 3 of the four count columns are 0 (i.e. 7 rows in the example below).

1   0   0   0   9
2   0   9   0   0
3   10  0   0   0
4   0   10  4   0
5   0   0   0   10
6   0   0   0   10
7   0   0   0   10
8   0   10  0   0
9   5   0   5   0

I can code this as a loop in R, but since the original files each contain 60+ million rows, I wonder whether there is a workaround with awk, or with sed and wc -l.

Best Answer

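Using GNU sed, this counts the rows that contain at least 3 zero-valued columns. A line is kept only if a third occurrence of a tab followed by a standalone 0 can be matched; otherwise it is deleted: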

sed -E 's/\t0\>/&/3;t;d' file  | wc -l

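As pointed out by Isaac, to count only the rows with exactly 3 zero-valued columns, use this instead. Lines with a fourth match are skipped, and the remaining lines are printed only if they have a third match: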

sed -n 's/\t0\>//4;t;s//&/3p' file | wc -l
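
Since the question also mentions awk, the same count can be sketched in awk. This is not part of the original answer. The function name count_three_zero_rows is hypothetical, and the sketch assumes tab-delimited input with the index in column 1 and the counts in the remaining columns:

```shell
# Hypothetical helper: prints the number of rows in which exactly three
# of the count columns (fields 2..NF) are zero, summed over all given files.
count_three_zero_rows() {
  awk -F'\t' '
    {
      zeros = 0
      # Skip field 1 (the index) and tally the zero-valued count columns.
      for (i = 2; i <= NF; i++)
        if ($i == 0) zeros++
      if (zeros == 3) n++
    }
    # n + 0 forces numeric output, so 0 is printed when nothing matched.
    END { print n + 0 }
  ' "$@"
}
```

Calling count_three_zero_rows file prints the count directly, so wc -l is not needed. Changing zeros == 3 to zeros >= 3 would count rows with at least three zeros, like the first sed command.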