AWK – Managing Paragraphs of 4 Lines

awk sed text-processing

I have a file made up of more than 2000 paragraphs of 4 lines each.
In each paragraph, I need to replace the content between brackets, as in the example below.

So for each paragraph:

  • the first two lines are the inputs and stay unchanged.
  • in the third line, the content between the brackets is replaced by the content between the brackets of the second line.
  • in the fourth line, the content between the brackets is replaced by the content between the brackets of the first line.

I hope it's clear enough.

–Inputs–

A1 [A3 A4 A5] A2
B1 [B3 B4 B5] B2
C1 [C3 C4] C2
D1 [D3 D4] D2

E1 [E3 E4 E5] E2
F1 [F3 F4 F5] F2
G1 [G3 G4] G2
H1 [H3 H4] H2

–Outputs–

A1 [A3 A4 A5] A2
B1 [B3 B4 B5] B2
C1 [B3 B4 B5] C2
D1 [A3 A4 A5] D2

E1 [E3 E4 E5] E2
F1 [F3 F4 F5] F2
G1 [F3 F4 F5] G2
H1 [E3 E4 E5] H2

Do you have a solution? I guess it can be done with awk and gsub, but I don't know how.

Best Answer

awk -F'[][]' -v OFS= '++i==1 {a=$2} i==2 {b=$2} i==3 {$2="[" b "]"} i==4 {$2="[" a "]"} !NF {i=0} 1' input.txt

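With square brackets as the field separators, your replacement sources/targets are in $2. Assigning to $2 makes awk rebuild the line from its fields, so OFS is set to the empty string to avoid extra separators, and the brackets are added back by hand.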
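To see the field splitting in isolation, here is a small sketch that prints each field of one sample line from the question. The function name split_demo is only illustrative and is not part of the answer.

```shell
# Split one sample line on "[" or "]" and show each resulting field.
# split_demo is a hypothetical helper name used for this illustration.
split_demo() {
  printf '%s\n' 'A1 [A3 A4 A5] A2' |
    awk -F'[][]' '{ for (n = 1; n <= NF; n++) printf "$%d=<%s>\n", n, $n }'
}
split_demo
```

This prints $1=<A1 >, $2=<A3 A4 A5> and $3=< A2>, one per line. The spaces around the brackets stay in $1 and $3, which is why an empty OFS reproduces the original spacing.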

We increment i on each line, and reset it to zero between paragraphs. The value of i (1 through 4) tells us what to do with $2.
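The same logic can be spelled out one rule per line, which may be easier to adapt. This is a sketch with the blank-line check moved to the top, not a different method. The wrapper name swap_brackets is an assumption made for the example, and it assumes exactly one bracket pair per line, as in the sample data.

```shell
# Expanded form of the one-liner; swap_brackets is a hypothetical wrapper name.
# Reads the named files, or standard input when no file is given.
swap_brackets() {
  awk -F'[][]' -v OFS= '
    !NF    { i = 0; print; next }   # empty line: paragraph ends, restart the count
           { i++ }                  # position of this line inside its paragraph
    i == 1 { a = $2 }               # remember bracket content of line 1
    i == 2 { b = $2 }               # remember bracket content of line 2
    i == 3 { $2 = "[" b "]" }       # line 3 takes the content of line 2
    i == 4 { $2 = "[" a "]" }       # line 4 takes the content of line 1
           { print }
  ' "$@"
}
```

It can be called as swap_brackets input.txt, or used at the end of a pipeline.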
