Retrieving lines from a file depending on other lines

awk grep sed text-processing

Imagine the following file structure:

foo.bar.1
blabla
moreblabla
relevant=yes
foo.bar.2
relevant=no
foo.bar.3
blablabla
foo.bar.4
relevant=yes

I want to retrieve every foo.bar line whose block (the lines after it, up to the next foo.bar line) contains a line reading relevant=yes.

So the output should be:

foo.bar.1
foo.bar.4

I could of course write a program/script that iterates through the lines, remembers each foo.bar line, and prints it when a relevant=yes line follows it before the next foo.bar. But I thought there might be an out-of-the-box way using standard Unix utilities (grep/sed/awk)?

Thanks for any hints!

Best Answer

If the input is processed line by line, then processing needs to go like this:

  • if the current line is foo.bar, store it, forgetting any previous foo.bar line that wasn't enabled for output;
  • if the current line is relevant=yes, this enables the latest foo.bar for output.

This kind of reasoning is a job for awk. (It can also be done in sed if you like pain.)

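awk '
    # Remember the most recent foo.bar line, replacing any unprinted one.
    /^foo\.bar/ { foobar = $0 }
    # Print the remembered line once, then clear it to avoid duplicates.
    /^relevant=yes$/ { if (foobar != "") { print foobar; foobar = "" } }
'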
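To try this against the sample from the question, something like the following sketch should work. The function name relevant_blocks and the temporary file are only illustrative assumptions, not part of the original answer. With no file argument the function reads standard input.

```shell
# Hypothetical helper name; it wraps the awk program from the answer.
relevant_blocks() {
    awk '
        /^foo\.bar/ { foobar = $0 }
        /^relevant=yes$/ { if (foobar != "") { print foobar; foobar = "" } }
    ' "$@"
}

# Recreate the sample input from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
foo.bar.1
blabla
moreblabla
relevant=yes
foo.bar.2
relevant=no
foo.bar.3
blablabla
foo.bar.4
relevant=yes
EOF

relevant_blocks "$sample"
rm -f "$sample"
```

For the sample input this should print foo.bar.1 and foo.bar.4. Clearing foobar after printing means a block containing several relevant=yes lines is still reported only once.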