SED and GREP – Get All Regex Matches Between Two Patterns

grep, regular expression, sed

I've got a file with a bunch of long lines. I'd like to grab every group of text between two patterns and print the groups to a new file, one match per line. I could do this in Python, but I'd prefer to use only command-line tools for this task. If there is no end pattern, I'd like to grab everything until the end of the line.

Something like:

input: 
xxSTART relevanttext xxEND something else xxSTART even more relevant

output:
relevanttext
even more relevant

Best Answer

If GNU grep is an option, you could pass the -P (Perl-compatible regex) flag together with -o (print only the matched text, one match per line) and use lookahead assertions, lookbehind assertions, and non-greedy matches to pull out what you need:

# The spaces inside the assertions keep the padding around each marker out of the output.
echo 'xxSTART relevanttext xxEND something else xxSTART even more relevant' |
grep -oP '(?<=xxSTART ).*?(?= xxEND|$)'
relevanttext
even more relevant
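Since the question asks for the matches to land in a new file, here is a minimal sketch of the same lookaround approach applied to a file instead of a pipe. The scratch directory and the file names input.txt and output.txt are placeholders made up for illustration, and the pattern assumes exactly one space next to each marker, as in the sample input.

```shell
# Hypothetical scratch files; substitute your own input and output paths.
workdir=$(mktemp -d)
printf '%s\n' \
    'xxSTART relevanttext xxEND something else xxSTART even more relevant' \
    'no markers on this line' \
    'xxSTART last one xxEND' > "$workdir/input.txt"

# -o prints each match on its own line; lines without a match print nothing.
# The spaces inside the assertions assume one space next to each marker.
grep -oP '(?<=xxSTART ).*?(?= xxEND|$)' "$workdir/input.txt" > "$workdir/output.txt"

cat "$workdir/output.txt"
```

With a single file argument grep does not prefix matches with the file name, so output.txt should contain only the extracted text.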

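Or, as Stephane Chazelas suggests, use the nifty \K in place of the lookbehind assertion. \K discards everything matched so far from the reported match, so only the text after it is printed: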

echo 'xxSTART relevanttext xxEND something else xxSTART even more relevant' |
grep -oP 'xxSTART \K.*?(?= xxEND|$)'