Ubuntu – How to count number of partial occurrences of a string in a file

bash, command line

I have a file in which I need to count all partial matches of an input string.
Here is a simple example of what I need:

In a file with this content:

Good-Black-Cat
Bad-Red-Cat
Bad-Gray-Dog
Good-Golden-Dog
Bad-White-Dog
Good-Tabby-Cat
Bad-Siamese-Cat

I need to count how many times the partial string "Good-*-Cat" (where * can be anything; it doesn't matter) appears. The expected count is 2.

Any help will be appreciated.

Best Answer

Given

$ cat file
Good-Black-Cat
Bad-Red-Cat
Bad-Gray-Dog
Good-Golden-Dog
Bad-White-Dog
Good-Tabby-Cat
Bad-Siamese-Cat

then

$ grep -c 'Good-.*-Cat' file
2

Note that this is a count of matching lines - so for example it won't work for multiple occurrences per line, or for occurrences that span lines.
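
To see the per-line limitation concretely, here is a small sketch using a hypothetical two-line sample (the sample contents and the variable names are made up for illustration and are not part of the question's file):

```shell
# Hypothetical sample: the first line contains two occurrences.
sample=$(mktemp)
printf '%s\n' 'Good-Black-Cat Good-Tabby-Cat' 'Bad-Red-Cat' > "$sample"

# grep -c counts matching lines, so both occurrences on line 1 count once.
line_count=$(grep -c 'Good-.*-Cat' "$sample")
echo "$line_count"   # prints 1, although there are 2 occurrences

rm -f "$sample"
```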

Alternatively, with awk (n+0 makes the count print as 0 rather than as an empty line when nothing matches):

awk '/Good-.*-Cat/ {n++} END {print n+0}' file

If you need to match multiple possible occurrences per line, then I'd suggest perl:

perl -lne '$c += () = /Good-.*?-Cat/g }{ print $c' file

where /Good-.*?-Cat/g matches multiple times (g) and non-greedily (.*?), and the () = assignment evaluates the match in list context. That list assignment, used in the scalar context of +=, yields the number of matches, which we add to the count.

Alternatively, you could use grep in Perl-compatible regular expression (PCRE) mode (to enable the non-greedy modifier), with -o to output only the matching portions - then count those with wc:

grep -Po 'Good-.*?-Cat' file | wc -l
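
If your grep was built without -P support, one possible workaround is a bracket expression instead of the non-greedy quantifier. This is a sketch under an added assumption that is not stated in the question: the middle field never contains a hyphen. The sample contents and variable names are hypothetical.

```shell
# Hypothetical sample: two occurrences on line 1, none on the other lines.
sample=$(mktemp)
printf '%s\n' 'Good-Black-Cat Good-Tabby-Cat' 'Bad-Red-Cat' 'Good-Golden-Dog' > "$sample"

# [^-]* cannot cross a hyphen, so each match stops at its own "-Cat".
# tr strips the padding that some wc implementations add.
count=$(grep -o 'Good-[^-]*-Cat' "$sample" | wc -l | tr -d ' ')
echo "$count"   # prints 2

rm -f "$sample"
```

Unlike .*?, this pattern will not match across fields, so a line such as Good-Golden-Dog Bad-Red-Cat is not counted.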

If you also need to match occurrences that may span a line boundary, then you can do so in perl by unsetting the record separator (note: this means that the whole file is slurped into memory) and adding the s regex modifier, e.g.

perl -0777 -nE '$c += () = /Good-.*?-Cat/gs }{ say $c' file

Related Solutions

Ubuntu – How to count occurrences of each character

You could use this:

sed 's/./&\n/g' 1.txt | sort | uniq -ic
  4  
  5 a
  1 c
  1 k
  1 M
  1 n
  5 o
  2 s
  4 t
  2 w
  1 y

The sed part places a newline after every character. Then we sort the output alphabetically. Finally, uniq counts the number of occurrences. The -i flag of uniq can be omitted if you don't want case insensitivity.

How to Output Web Page HTML Source Code to a File

Why can't you use curl?

curl web-address > file-source

This writes the page's source code to the file file-source. For example:

curl http://askubuntu.com/questions/822139/how-to-output-web-page-html-source-code-into-a-file > source-html
