Text Processing – Search for a String and Print Everything Before and After

sed, text processing

I have this file:

sometext1{
string1
}

sometext2{
string2
string3
}

sometext3{
string4
string5
string6
}

I want to search this file for a specific string and print everything before that string back to the opening { and everything after it up to the closing }. I tried to do this with sed, but if I print everything in the range /{/,/string2/, for example, sed prints this:

sometext1{
string1
}

sometext2{
string2
sometext3{
string4
string5
string6
}
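
The exact command is not shown above. Assuming it was something along the lines of sed -n '/{/,/string2/p', the sketch below reproduces that output. The range closes at string2, but a new range opens at the next line containing {, and since string2 never appears again it runs to the end of the file. The temporary file and the variable names are placeholders for illustration only.

```shell
# Recreate the sample file from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
sometext1{
string1
}

sometext2{
string2
string3
}

sometext3{
string4
string5
string6
}
EOF

# Assumed form of the attempted command: print each /{/,/string2/ range.
# After the first range ends at string2, "sometext3{" starts a second
# range whose end pattern is never found, so it prints through to EOF.
attempt_out=$(sed -n '/{/,/string2/p' "$sample")
printf '%s\n' "$attempt_out"
rm -f "$sample"
```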

If I search for the string "string2" I need the output to be:

sometext2{
string2
string3
}

Thanks.

Best Answer

Here are two commands. If you want a command that trims up to the last .*{$ line in a sequence (as @don_crissti does with ed) you can do:

sed 'H;/{$/h;/^}/x;/{\n.*PATTERN/!d'

...which works by appending every line to Hold space following a \newline character, overwriting hold space for every line that matches {$, and swapping hold and pattern spaces for every line that matches ^}, thereby flushing its buffer.

It only prints lines which match a { then a \newline and then PATTERN at some point - and that only ever happens immediately following a buffer swap.
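
As a rough sketch of how this could be applied to the file from the question, the command can be wrapped in a small shell function with the search string substituted for PATTERN. The function name print_block and the temporary file are hypothetical. The search string is spliced straight into the regular expression, so this assumes it contains no regex metacharacters or slashes, and it assumes a sed that accepts ;-separated commands, such as GNU sed.

```shell
# Recreate the sample file from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
sometext1{
string1
}

sometext2{
string2
string3
}

sometext3{
string4
string5
string6
}
EOF

# Hypothetical helper: $1 is the search string, $2 is the file.
# H      append each line to hold space
# /{$/h  restart the buffer at every line ending in {
# /^}/x  on a closing }, swap the buffered block into pattern space
# .../!d delete anything that is not a whole block containing $1
print_block() {
    sed 'H;/{$/h;/^}/x;/{\n.*'"$1"'/!d' "$2"
}

result=$(print_block string2 "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Only the block around string2 should survive, because that is the only time the pattern space holds a {, a newline, and the search string together.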

It elides any lines in a series of {$ matches to the last in the sequence, but you can get all of those inclusive like:

sed '/PATTERN.*\n/p;//g;/{$/,/^}/H;//x;D'

It swaps pattern and hold spaces for every ...{$.*^}.* sequence, appends all lines within the sequence to Hold space following a \newline character, and Deletes up to the first \newline character in pattern space on every line cycle before starting again with what remains.

Of course, the only time it ever gets a \newline in pattern space is when an input line matches ^} (the end of your range), so when it reruns the script on any other occasion it just pulls in the next input line as usual.

When PATTERN is found in the same pattern space as a \newline, though, it prints the lot before overwriting it with ^} again (so it can end the range and flush the buffer).

Given this input file (thanks don):

sometext1{
string1
}

sometext2{
PATTERN
string3
}

sometext3{
string4
string5
string6
}

Header{
sometext4{
some string

string unknown

here's PATTERN and PATTERN again
and PATTERN too
another string here
}
}

The first prints:

sometext2{
PATTERN
string3
}
sometext4{
some string

string unknown

here's PATTERN and PATTERN again
and PATTERN too
another string here
}

...and the second prints:

sometext2{
PATTERN
string3
}
Header{
sometext4{
some string

string unknown

here's PATTERN and PATTERN again
and PATTERN too
another string here
}
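
The first command's handling of stacked headers, visible above in the missing Header{ line, can be checked on a smaller input. This is only an illustrative sketch: the temporary file and its contents are made up for the example and are not part of the original input.

```shell
# A made-up two-level input: an outer header directly followed by an inner one.
nested=$(mktemp)
cat > "$nested" <<'EOF'
Header{
inner{
PATTERN
}
}
EOF

# "inner{" overwrites the hold space that "Header{" had just started,
# so only the innermost header reaches the output. The trailing "}" is
# swapped in on its own and deleted because it contains no PATTERN.
nested_out=$(sed 'H;/{$/h;/^}/x;/{\n.*PATTERN/!d' "$nested")
printf '%s\n' "$nested_out"
rm -f "$nested"
```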
