I need to extract (or count) the lines (in a file)
that have two or more dots. The lines should not start with dot
(it’s OK if they end with a dot), and there must not be two dots in a row
(i.e., the dots are all separated with non-dot characters).
Output Example:
a.b.
a.b.com
a.b.c.
a.b.c.com
But not:
a.com
a..b
a.b.c..d
I tried this command:
grep -P '^[^.]+\.([^.]+\.)+[.]+' file.txt | wc -l
but it didn't find any matching lines.
How should I do this?
Best Answer
\. and [.] are equivalent: they both match a literal dot, and not any other character. As a matter of style, pick one and use it consistently.

Your regex has ([^.]+\.)+ followed by [.]+ which is really (sort of) equivalent to [^.]+\. followed by [.] with the result that your grep is looking for lines that contain text.text.. (i.e., two dots in a row). If you check, you’ll see that your command matches the line a.b.. (with two trailing dots).

To fix it, change the [.] to [^.] (perhaps that’s what you meant originally?), change the following + to an * and add a $ at the end. After some number of text. groups, this requires/allows any number (zero or more) of characters other than dot, up to the end of the line.

Alternatively, use two grep commands in a pipeline. The first grep finds lines that begin with a non-dot character and include at least two dots. The second grep removes lines that have two consecutive dots.

Instead of grep … | wc -l just do grep -c …
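
The two approaches described in this answer can be sketched as follows. This is a minimal illustration rather than the original commands. It assumes a POSIX-style shell with mktemp available. It uses grep -E in place of the question's -P, since the pattern needs only extended-regex features. The temporary sample file, the variable names, and the exact first-stage pattern of the two-step pipeline are assumptions made for the example.

```shell
# Sample input: the seven example lines from the question,
# written to a hypothetical temporary file.
sample=$(mktemp)
printf '%s\n' a.b. a.b.com a.b.c. a.b.c.com a.com a..b a.b.c..d > "$sample"

# Approach 1: a single anchored pattern.
#   ^[^.]+\.      one "text." group at the start of the line
#   ([^.]+\.)+    one or more further "text." groups
#   [^.]*$        optional non-dot text up to the end of the line
pattern='^[^.]+\.([^.]+\.)+[^.]*$'
matches=$(grep -E "$pattern" "$sample")
count=$(grep -Ec "$pattern" "$sample")      # -c counts, so no "| wc -l"

# Approach 2: two greps in a pipeline.
#   first grep:  line starts with a non-dot and contains at least two dots
#   second grep: -v drops any line that has two consecutive dots
two_step=$(grep '^[^.].*\..*\.' "$sample" | grep -v '\.\.')
two_step_count=$(grep '^[^.].*\..*\.' "$sample" | grep -vc '\.\.')

printf '%s\n' "$matches"
echo "single pattern: $count, two-step: $two_step_count"

rm -f "$sample"
```

On the seven sample lines, both approaches keep a.b., a.b.com, a.b.c., and a.b.c.com and reject a.com, a..b, and a.b.c..d, so each count is 4.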