Consider the following toy example:
this is a line
this line contains FOO
this line is not blank
This line also contains FOO

Some random text
This line contains FOO too
Not blank
Also not blank

More random text
FOO!
Yet more random text
FOO!
So, I want the results of a grep for FOO, but with the extra wrinkle that lines following the matching lines should be included, as long as they are not blank, and they do not themselves contain FOO. So the matches would look as follows, with the different matches separated:
MATCH 1
this line contains FOO
this line is not blank
MATCH 2
This line also contains FOO
MATCH 3
This line contains FOO too
Not blank
Also not blank
MATCH 4
FOO!
Yet more random text
MATCH 5
FOO!
Bonus points (metaphorically speaking) for a simple single line script that can be run on the command line.
ADDENDUM: Adding a running count of the match number would be quite handy, if it is not too hard.
Best Answer
Using awk rather than grep:
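A minimal sketch of such an awk program, written to follow the state-machine description given further down rather than taken verbatim from the answer. The wrapper name foo_groups is hypothetical and only there so the command can be reused; the awk body is what matters.

```shell
# Hypothetical wrapper: print every line containing FOO together with the
# non-blank, non-FOO lines that follow it, with an empty line between groups.
foo_groups() {
    awk '
        # A FOO line starts a new group; close the previous one if still open.
        /FOO/ { if (matching) print ""; matching = 1 }
        # An empty line ends the current group.
        /^$/  { if (matching) print ""; matching = 0 }
        # Print every line seen while inside a group.
        matching
    ' "$@"
}
```

It would be run as `foo_groups file`, where file is a placeholder for the input. The awk program can equally be pasted onto a single command line without the wrapper.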
A version that enumerates the matches:
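A sketch of how the numbered variant could look, again reconstructed from the description that follows and not copied from the answer. The names foo_groups_numbered and flush_group are hypothetical; buf and the Match N label come from the description. This sketch prints the groups back to back, without an empty line between them.

```shell
# Hypothetical wrapper: same grouping as before, but each group is buffered
# and printed under a running "Match N" label.
foo_groups_numbered() {
    awk '
        # Print the buffered group if one is open, then empty the buffer.
        function flush_group() {
            if (matching) printf "Match %d\n%s\n", ++n, buf
            buf = ""
        }
        /FOO/ { flush_group(); matching = 1 }   # FOO line: start a new group
        /^$/  { flush_group(); matching = 0 }   # empty line: end the group
        # Collect lines while inside a group.
        matching { buf = (buf == "" ? $0 : buf "\n" $0) }
        # Emit a group that is still open at end of input.
        END   { flush_group() }
    ' "$@"
}
```

It would be run as `foo_groups_numbered file`. Buffering is what lets the label be printed before the lines of its group while the counter is only incremented once the group is known to be complete.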
Both awk programs use a very simple "state machine" to determine whether they are currently matching or not matching. A match of the pattern FOO causes the program to enter the matching state, and a match of the pattern ^$ (an empty line) causes it to enter the non-matching state.

Output of empty lines between matching sets of data happens at state transitions from matching (either into matching or into non-matching).

The first program prints any line when in the matching state.

The second program collects lines in a buf variable when in the matching state. It flushes (empties) this after possibly printing it (depending on the state), together with a Match N label, at state transitions (when the first program would output an empty line).

Output of this last program on the sample data: