Text Processing – How to Delete All Text Between Nested Curly Brackets in a Multiline Text File

This question follows on from
"How can I delete all text between curly brackets in a multiline text file?" The task is the same, except that the earlier question did not require handling nested brackets.

Example:

This is {
{the multiline
text} file }
that wants
{ to {be
changed}
} anyway.

Should become:

This is 
that wants
 anyway.

Is it possible to do this with a one-line shell command (awk, sed, perl, grep, cut, tr, etc.)?

Best Answer

$ sed ':again;$!N;$!b again; :b; s/{[^{}]*}//g; t b' file3
This is 
that wants
 anyway.
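
To try this without an existing file3, the sample from the question can be fed in through a here-document. The following is only a sketch, assuming a POSIX shell: the function name strip_braces is hypothetical, and the sed program is the same as above, merely split into one -e per command in case the local sed does not accept `;` directly after a label.

```shell
# Hypothetical helper: the same sed program as above, one -e per command.
# With no file argument it reads standard input.
strip_braces() {
  sed -e ':again' -e '$!N' -e '$!b again' \
      -e ':b' -e 's/{[^{}]*}//g' -e 't b' "$@"
}

# Feed the question's sample text in via a quoted here-document.
strip_braces <<'EOF'
This is {
{the multiline
text} file }
that wants
{ to {be
changed}
} anyway.
EOF
```

Calling strip_braces file3 instead should behave like the one-liner above.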

Explanation:

  • :again;$!N;$!b again

    This reads the whole file into the pattern space.

    :again is a label. N appends the next input line to the pattern space, and $!N does so only if we are not already on the last line. $!b again branches back to the again label unless this is the last line.

  • :b

    This defines a label b.

  • s/{[^{}]*}//g

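    This removes every brace group whose contents include no further braces, i.e. the innermost groups. Because [^{}] also matches a newline in the pattern space, such a group may span several lines.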

  • t b

    If the above substitute command resulted in a change, jump back to label b. In this way, the substitute command is repeated until all brace-groups are removed.
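
The effect of the individual pieces can be seen on a smaller input. This is a sketch rather than part of the original answer: the function names one_pass and all_passes and the test strings are made up for illustration.

```shell
# One pass of the substitution: only the innermost group {c} can match,
# because [^{}]* refuses to cross another brace.
one_pass() { sed -e 's/{[^{}]*}//g'; }

# The same substitution wrapped in the :b ... t b loop, so it repeats
# until a pass makes no change.
all_passes() { sed -e ':b' -e 's/{[^{}]*}//g' -e 't b'; }

printf 'a{b{c}d}e\n' | one_pass     # prints: a{bd}e
printf 'a{b{c}d}e\n' | all_passes   # prints: ae

# Without the :again ... N loop, sed processes one line at a time, so a
# group split across lines is never matched and both lines pass through.
printf 'a{b\nc}d\n' | all_passes
```

The last command is why the answer first collects the whole file: the substitution can only remove a group when both braces are in the pattern space together.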