I found an answer on another site that suggested grep -oP '^\w+|$'. I pointed out that the |$ is pointless in PCRE, since it just means "OR end of line" and will therefore always be true for regular lines. However, I can't exactly figure out what it does in GNU grep PCREs when combined with -o. Consider the following:
$ printf 'ab\na\nc\n\n' | perl -ne 'print if /ab|$/'
ab
a
c
$
(I am including the second prompt ($) character to show that the empty line is included in the results.)
As expected, in Perl, that will match every line, either because it contains an ab or because the $ matches the end of the line. GNU grep behaves the same way without the -o flag:
$ printf 'ab\na\nc\n\n' | grep -P 'ab|$'
ab
a
c
$
However, -o changes the behavior:
$ printf 'ab\na\nc\n\n' | grep -oP 'ab|$'
ab
$
This is the same as simply grepping for ab. The second part, the "OR end of line", seems to be ignored. It does work as expected without the -o flag, as shown above.
What's going on? Does -o ignore 0-length matches? Is that a bug or is it expected?
Best Answer
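My GNU grep man page says the following (the emphasis on "(non-empty)" is mine):
-o, --only-matching
    Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.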
I'm guessing it considers the end-of-line match to be an "empty match", so the line still counts as matching but -o has nothing to print for it.
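One way to check this guess is to compare how many lines grep considers matching with how many parts -o actually prints. The following is only a sketch, assuming a GNU grep built with PCRE support (-P); the input helper function and the variable names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: the sample input from the question
# (three non-empty lines followed by one empty line).
input() { printf 'ab\na\nc\n\n'; }

# -c counts matching lines. Every line should match, because $
# matches at the end of each line, including the empty one.
matching_lines=$(input | grep -cP 'ab|$')

# -o prints only the non-empty matched parts, one per output line,
# so the zero-length $ matches should contribute nothing here.
printed_parts=$(input | grep -oP 'ab|$' | wc -l | tr -d ' ')

echo "matching lines: $matching_lines"
echo "parts printed by -o: $printed_parts"
```

If the guess is right, the first number is 4 (every line matches) and the second is 1 (only the literal ab is printed), which agrees with the outputs shown in the question.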