Text Processing – Get 2 lines with exact text between them

awk, sed, text processing

I have a file with an unknown number of text blocks. Each block starts with a line containing the keyword "Start" and ends with a line containing the keyword "End". Between them there may be zero or more lines, each containing the keyword "Disk". I need to get rid of the blocks that have nothing between the Start and End lines; see the example.

I am processing input like this:

Server1:Start
Server1:End
Server2:Start
Disk1
Disk2
Server2:End
Server3:Start
Disk1
Server3:End

and my desired output is this:

Server2:Start
Disk1
Disk2
Server2:End
Server3:Start
Disk1
Server3:End

I know that I can use 'awk' or 'sed' to find text between two lines, but I do not know what to do when there are multiple occurrences of these two lines, or when there is no text between them.

I am running Ubuntu 17.10.

Looking forward to any help.

Edit: I deleted the post the first time because I thought I could do it using sed -e '/Start/,/End/d', but this actually removes everything.

Best Answer

To delete back-to-back Start and End lines, this should work in GNU sed:

$ sed -e '/Start/ {N; /^\(.*\):Start\n\1:End$/d }' < input

If we see Start, we load the next line with N and then check whether the pattern space consists of exactly Somename:Start\nSomename:End, with the same Somename on both lines (\n is a newline). If so, we delete it. Here, \1 is a back-reference to the first group, \(...\), and matches the same string that the group matched. .* just means any number (*) of any characters (.).
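For comparison, the same idea can be sketched in awk, which the question also mentions. This is not part of the sed answer above, only an alternative under the assumption that every Start line ends in ":Start" and its matching End line is the same name followed by ":End". The variable names input, result, held and name are made up for this illustration. The script holds each Start line back until the next line is known and drops the pair when that next line is the matching End.

```shell
# Sample input from the question, kept in a shell variable.
input='Server1:Start
Server1:End
Server2:Start
Disk1
Disk2
Server2:End
Server3:Start
Disk1
Server3:End'

result=$(printf '%s\n' "$input" | awk '
  # A Start line: flush any Start still held, then hold this one back.
  /:Start$/ { if (held != "") print held; held = $0; next }
  # The line right after a held Start decides what happens to it.
  held != "" {
      name = held
      sub(/:Start$/, "", name)
      # Print both lines unless this is the matching End (an empty block).
      if ($0 != name ":End") { print held; print }
      held = ""
      next
  }
  # Any other line is printed unchanged.
  { print }
  # A Start on the very last line has no partner, so print it as is.
  END { if (held != "") print held }
')

printf '%s\n' "$result"
```

On the sample input this should print the Server2 and Server3 blocks and drop the empty Server1 block. Unlike the sed one-liner, it does not depend on GNU extensions.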

Using sed -e '/Start/,/End/d' would indeed delete every single line, since the range covers the Start line, the End line, and everything between them. Every line in the input belongs to some Start/End block, so everything is deleted.
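A quick way to confirm this is to run the range delete on the sample input from the question and check that nothing survives. The variable names input and range_out are only for this sketch.

```shell
# Sample input from the question, kept in a shell variable.
input='Server1:Start
Server1:End
Server2:Start
Disk1
Disk2
Server2:End
Server3:Start
Disk1
Server3:End'

# The range /Start/,/End/ includes both end points, so each block is deleted whole.
range_out=$(printf '%s\n' "$input" | sed -e '/Start/,/End/d')

if [ -z "$range_out" ]; then
    echo "range delete left nothing"
else
    printf '%s\n' "$range_out"
fi
```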
