Bash – How to Edit a Multiline Pattern (Sed and Awk Available)

Tags: awk, bash, regex, sed

Here is my very specific task. I have patterns of the form

.*[class|namespace|struct].*

(this is part one of the pattern, to be precise),

followed by an arbitrary number of newlines (\n) and spaces intermixed (possibly zero newlines, but at least one space), and right after that a very specific pattern. I know that this line always has the form

{specific_text;^

That is the second part of the pattern.

Note: whenever part one of the pattern is present, part two is guaranteed to follow it.

The task

For each part-one match, I need to replace the specific-text part that follows it with a single curly brace and a newline, like this: "{\n".

How do I do this in a bash shell with awk or sed?

Note: in other words, I want to change the first line that matches my very specific pattern, AND which is the first among its kind after the class|namespace|struct keyword.

Example.

class A { specific_text;

is transformed into

class A {

And this

class A

{ specific_text;

becomes

class A

{

Note that the newlines are preserved in the last example.

Best Answer

sed '/class\|namespace\|struct/ {
   : loop
   s/{ *specific_text;$/{/
   t
   n
   b loop
   }'

It works like this:

  • Until class or namespace or struct is found, sed does nothing more than its default action: it prints incoming lines as they are.
  • When class or namespace or struct is found, sed enters a loop.
    • : loop is a label, beginning of the loop.
    • b loop branches to the label, it's the end of the loop.
    • There are two ways to leave the loop:
      • end of input,
      • t which branches to the end of the script.
    • t branches only if the preceding s actually made a substitution.
    • If t does not branch, the script continues to n, which prints the current line and reads the next one.

In other words, lines are printed both outside and inside the loop, but only inside the loop does sed attempt the substitution with s. A successful substitution leaves the loop. sed enters the loop again when another class or namespace or struct is encountered.

In yet other words this is exactly what you want (I think):

to change the first line that matches my very specific pattern, AND which is the first among its kind after the class|namespace|struct keyword.
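Since the question also allows awk, the same logic can be sketched as a small state machine there. This is only an illustration and not part of the original answer: the function name strip_specific_text and the flag armed are made up for the example, and [{] is used instead of a bare { so the brace is literal in any awk.

```shell
# Hypothetical helper: awk version of the sed loop.
strip_specific_text() {
  awk '
    # A keyword arms the search (the sed script enters its loop here).
    /class|namespace|struct/ { armed = 1 }
    # While armed, try the substitution; the first success disarms it.
    armed && sub(/[{] *specific_text;$/, "{") { armed = 0 }
    # Every line is printed, changed or not.
    { print }
  ' "$@"
}

# Usage: only the first brace line after the keyword is rewritten.
printf 'class A\n{ specific_text;\n{ specific_text;\n' | strip_specific_text
```

The flag plays the role of "being inside the loop": it is set by a keyword and cleared by the first successful substitution.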

Notes:

  • The script doesn't care what is between class (or namespace or struct) and the specific pattern.
  • I adjusted your patterns. In particular
    • I allowed spaces between { and specific_text because your test cases have spaces;
    • I used $ instead of ^, because ^ at the end of your pattern is literal, which makes no sense there (maybe you confused the ^ and $ anchors).
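The point about ^ and $ can be checked with two one-liners. This is a minimal sketch using a made-up one-line input; it relies on the fact that in a basic regular expression ^ is an anchor only at the start of the expression.

```shell
# ^ at the end of the regex is a literal caret, so nothing matches
# and the line is printed unchanged.
printf '{specific_text;\n' | sed 's/{specific_text;^/{/'

# $ anchors the end of the line, so the substitution applies.
printf '{specific_text;\n' | sed 's/{specific_text;$/{/'
```

The first command prints the line as it was; the second prints a lone {.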

Example input:

foo
bar
class A {      specific_text;
baz { specific_text;
qux
this is my namespace

whatever
abc {specific_text;
foo
{ specific_text;
{specific_text;

The output:

foo
bar
class A {
baz { specific_text;
qux
this is my namespace

whatever
abc {
foo
{ specific_text;
{specific_text;

Altered lines:

  • class A { specific_text;
  • abc {specific_text; (because of namespace a few lines before)
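The two cases from the question can be checked the same way. The sketch below wraps the script from this answer in a shell function; the name fix_braces is invented for the example, and GNU sed is assumed because \| alternation in a basic regex is a GNU extension.

```shell
# Hypothetical wrapper around the sed script above (GNU sed assumed).
fix_braces() {
  sed '/class\|namespace\|struct/ {
    : loop
    s/{ *specific_text;$/{/
    t
    n
    b loop
  }' "$@"
}

# Case 1: keyword and specific text on the same line.
printf 'class A { specific_text;\n' | fix_braces

# Case 2: specific text on a later line; the line break is preserved.
printf 'class A\n{ specific_text;\n' | fix_braces
```

The first call prints "class A {"; the second prints "class A" followed by "{" on its own line. To process a file instead of standard input, pass its name as an argument to the function.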