Count text occurrences per line

grep, text processing

I have to parse huge text files where certain lines are of interest and others are not. Within those of interest I have to count the occurrences of a certain keyword.

Assume the file is called input.txt and looks like this:

format300,format250,format300
format250,ignore,format160,format300,format300
format250,format250,format300

I want to exclude the lines containing ignore and count the occurrences of format300. How do I do that?

What I've got so far is this command which only counts ONCE PER LINE (which is not yet good enough):

cat input.txt | grep -v ignore | grep 'format300' | wc -l

Any suggestions? If possible I want to avoid using perl.

Best Answer

You don't need the initial cat; grep can read the file directly. Piping a single file through cat like this is known as a Useless Use of Cat (UUOC).

The useful option here is grep -o, which prints only the matched text, one match per output line, so a line containing format300 twice produces two output lines.
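To see what -o does on its own, here is a minimal sketch that feeds the first line of your sample to grep through printf. The variable name matches is only for illustration, and -o is assumed to be available (GNU and BSD grep both support it).

```shell
# Feed one sample line to grep; -o prints each match on its own line.
matches=$(printf 'format300,format250,format300\n' | grep -o format300)

# Show the matches: the line contained format300 twice, so two lines appear.
printf '%s\n' "$matches"
```

Each occurrence becomes a separate line, which is what makes a plain line count work as an occurrence count.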

Then count the resulting lines with wc -l:

grep -v ignore YOUR_FILE | grep -o format300 | wc -l

This prints 3 for your small sample.
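If you want to check this yourself, the sketch below recreates your sample in a temporary file and runs the pipeline. It also shows a single-process awk variant as an alternative I am adding, not something you asked for: it relies on gsub() returning the number of replacements it made. The variable names (sample, count_grep, count_awk) are made up for the example.

```shell
# Recreate the sample from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' \
  'format300,format250,format300' \
  'format250,ignore,format160,format300,format300' \
  'format250,format250,format300' > "$sample"

# Pipeline from above: drop "ignore" lines, print one match per line, count lines.
count_grep=$(grep -v ignore "$sample" | grep -o format300 | wc -l)

# awk alternative: skip lines matching "ignore"; gsub() returns how many
# replacements it made, and "&" re-inserts the match so the line is unchanged.
count_awk=$(awk '!/ignore/ { n += gsub(/format300/, "&") } END { print n + 0 }' "$sample")

echo "grep: $count_grep, awk: $count_awk"
rm -f "$sample"
```

Both counts come out as 3 for the sample. Note that either version also skips any line where ignore appears as part of a longer word, since the match is a plain substring.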
