Say I have a file:
# file: 'test.txt'
foobar bash 1
bash
foobar happy
foobar
I only want to know what words appear after "foobar", so I can use this regex:
"foobar \(\w\+\)"
The parentheses indicate that I have a special interest in the word right after foobar. But when I run grep "foobar \(\w\+\)" test.txt, I get the entire lines that match the entire regex, rather than just "the word after foobar":
foobar bash 1
foobar happy
I would much prefer that the output of that command looked like this:
bash
happy
Is there a way to tell grep to only output the items that match the grouping (or a specific grouping) in a regular expression?
Best Answer
GNU grep has the -P option for Perl-style regexes, and the -o option to print only what matches the pattern. These can be combined using look-around assertions (described under Extended Patterns in the perlre manpage) to remove part of the grep pattern from what is determined to have matched for the purposes of -o.
\K behaves like a short form (and a more efficient form) of (?<=pattern), which you use as a zero-width look-behind assertion before the text you want to output. (?=pattern) can be used as a zero-width look-ahead assertion after the text you want to output.
For instance, if you wanted to match the word between foo and bar, you could put \K after foo and a look-ahead for bar after the word, or (for symmetry) use a look-behind for foo together with the same look-ahead.
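The commands from the original post did not survive here, so what follows is a minimal sketch of the technique described above, not necessarily the exact commands the answer gave. It assumes GNU grep built with PCRE support (the -P option is not available in every grep). The temporary file and the variable names are illustrative only, and the sample line 'foo baz bar' is made up for the demonstration.

```shell
# Recreate the sample file from the question in a temporary location.
tmp=$(mktemp)
printf '%s\n' 'foobar bash 1' 'bash' 'foobar happy' 'foobar' > "$tmp"

# \K drops "foobar " from the reported match, so -o prints only the word.
with_k=$(grep -oP 'foobar \K\w+' "$tmp")

# The same result with an explicit look-behind assertion.
with_lookbehind=$(grep -oP '(?<=foobar )\w+' "$tmp")

# The word between "foo" and "bar": \K before it, a look-ahead after it.
between=$(printf '%s\n' 'foo baz bar' | grep -oP 'foo \K\w+(?= bar)')

# The symmetric form: look-behind before the word, look-ahead after it.
between_sym=$(printf '%s\n' 'foo baz bar' | grep -oP '(?<=foo )\w+(?= bar)')

# Prints "bash" and "happy" on separate lines, then "baz".
printf '%s\n' "$with_k" "$between"

rm -f "$tmp"
```

Applied to the question, grep -oP 'foobar \K\w+' test.txt is the form that prints just the word after foobar. Note that a look-behind in this regex flavor generally has to be fixed-length, whereas the text before \K can be of variable length.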