Group matching with grep includes extra characters

grep, regular expression

I wanted to extract some text with regex in bash, so I decided to try the following simple example out.

echo "abc def ghi" | grep -Po " \K(.*?) "

I expected to get "def", but to my surprise I got "def " (with an extra trailing space).

I'm interested in understanding why grep also includes the extra space at the end and how to get rid of it. I know I could post-process the result with another line but I'm interested in solving this with grep.

Best Answer

In short: \K makes grep drop everything matched before the \K from the reported match. It does not affect what comes after the \K, so the trailing space in your pattern is still matched and printed.
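To make this visible, here is a small sketch that prints each result in brackets so the trailing space shows up. It assumes GNU grep built with PCRE support (the -P option); the variable names are only illustrative.

```shell
input="abc def ghi"

# The trailing space sits after \K, so it is part of the reported match.
with_space=$(printf '%s\n' "$input" | grep -Po ' \K(.*?) ')

# A lookahead only checks that a space follows; it does not consume it.
without_space=$(printf '%s\n' "$input" | grep -Po ' \K(.*?)(?= )')

printf '[%s]\n' "$with_space" "$without_space"
# [def ]
# [def]
```

Command substitution strips only trailing newlines, so the trailing space survives in the first variable.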

This might be enough:

" \K(.+)(?= )"

Where (?= ) is a positive lookahead: it requires a space to follow but does not consume it, so the space is not included in the match.

or perhaps better:

" \K([^ ]+)(?= )"
" \K(\w+)(?= )"

or similar.
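The difference between these variants only shows up with more than three words. A quick sketch, again assuming GNU grep with -P support, and using a longer sample input that is not from the question:

```shell
input="abc def ghi jkl"

# Greedy .+ runs to the last position that is still followed by a space.
greedy=$(printf '%s\n' "$input" | grep -Po ' \K(.+)(?= )')

# [^ ]+ cannot cross a space, so each space-delimited word matches separately.
per_word=$(printf '%s\n' "$input" | grep -Po ' \K([^ ]+)(?= )')

printf 'greedy: [%s]\n' "$greedy"
# greedy: [def ghi]
printf 'per word:\n%s\n' "$per_word"
# per word:
# def
# ghi
```

With the original three-word input, both patterns print just "def".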
