I have a file, prova.txt, like this:
Start to grab from here: 1
fix1
fix2
fix3
fix4
random1
random2
random3
random4
extra1
extra2
bla
Start to grab from here: 2
fix1
fix2
fix3
fix4
random1546
random2561
extra2
bla
bla
Start to grab from here: 1
fix1
fix2
fix3
fix4
random1
random22131
and I need to extract everything from "Start to grab from here" up to the first blank line. The output should look like this:
Start to grab from here: 1
fix1
fix2
fix3
fix4
random1
random2
random3
random4
Start to grab from here: 2
fix1
fix2
fix3
fix4
random1546
random2561
Start to grab from here: 1
fix1
fix2
fix3
fix4
random1
random22131
As you can see, the number of lines after "Start to grab from here" varies, so grep's -A and -B flags don't work:
cat prova.txt | grep "Start to grab from here" -A 15 | grep -B 15 "^$" > output.txt
Can you help me find a way to capture everything from the first matching line ("Start to grab from here") up to the next blank line? I cannot predict how many random lines will follow "Start to grab from here".
Any Unix-compatible solution is appreciated (grep, sed, or awk is preferred over perl or similar).
EDITED: after the brilliant response by @john1024, I would like to know if it's possible to:
1° sort the blocks (by their "Start to grab from here" number: 1, then 1, then 2)
2° remove the 4 fix lines (fix1, fix2, fix3, fix4); their content varies, but there are always 4 of them
3° optionally remove duplicate random lines, as sort -u would
The final output should look like this:
# fix lines removed - match 1 first time
Start to grab from here: 1
random1
random2
random3
random4
#fix lines removed - match 1 second time
Start to grab from here: 1
#random1 removed cause is a dupe
random22131
#fix lines removed - match 2 that comes after 1
Start to grab from here: 2
random1546
random2561
or
# fix lines removed - match 1 first time and the second too
Start to grab from here: 1
random1
random2
random3
random4
#random1 removed cause is a dupe
random22131
#fix lines removed - match 2 that comes after 1
Start to grab from here: 2
random1546
random2561
The second output is better than the first one. Some other Unix command magic is needed.
Best Answer
Using awk
Try:
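A minimal sketch of the awk range approach, assuming a small stand-in file named sample.txt (its name and contents are illustrative, not the asker's real data) in which an empty line ends each block:

```shell
# Build a small stand-in for prova.txt; the '' argument becomes an empty line.
printf '%s\n' \
    'Start to grab from here: 1' 'fix1' 'random1' '' 'extra1' 'bla' \
    'Start to grab from here: 2' 'fix1' 'random1546' '' 'bla' > sample.txt

# A range pattern with no action prints every line from a match of
# /Start to grab/ through the next empty line, inclusive.
awk '/Start to grab/,/^$/' sample.txt > output.txt
cat output.txt
```

The empty line that closes each range is part of the range, so it is printed as well; piping the result through sed '/^$/d' would drop it if it is not wanted.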
/Start to grab/,/^$/ defines a range. It starts with any line that matches Start to grab and ends with the first empty line, ^$, that follows.
Using sed
With very similar logic:
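A minimal sketch of the equivalent sed command, again assuming an illustrative stand-in file named sample.txt in which an empty line ends each block:

```shell
# Build a small stand-in for prova.txt; the '' argument becomes an empty line.
printf '%s\n' \
    'Start to grab from here: 1' 'fix1' 'random1' '' 'extra1' 'bla' \
    'Start to grab from here: 2' 'fix1' 'random1546' '' 'bla' > sample.txt

# -n suppresses automatic printing; the p command prints only the lines
# inside each /Start to grab/ ... empty-line range.
sed -n '/Start to grab/,/^$/p' sample.txt > output_sed.txt
cat output_sed.txt
```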
-n tells sed not to print anything unless we explicitly ask it to.
/Start to grab/,/^$/p tells it to print any lines in the range defined by /Start to grab/,/^$/.
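For the follow-up in the edit (group the blocks by header, drop the 4 fix lines, remove duplicates), one possible sketch is below. It is not part of the original answer and rests on several assumptions: the 4 lines right after each header are the ones to drop, an empty line ends each block (the empty lines in sample.txt are placed where the question implies them), no line contains a tab, and the headers sort correctly as plain text (a header ending in 10 would sort before one ending in 2).

```shell
# Stand-in for prova.txt, with an empty line closing the first two blocks.
cat > sample.txt <<'EOF'
Start to grab from here: 1
fix1
fix2
fix3
fix4
random1
random2
random3
random4

extra1
extra2
bla
Start to grab from here: 2
fix1
fix2
fix3
fix4
random1546
random2561

extra2
bla
bla
Start to grab from here: 1
fix1
fix2
fix3
fix4
random1
random22131
EOF

# Step 1: tag each kept line as "header TAB sequence TAB line".
#   - a header starts a block and schedules the next 4 lines to be skipped
#   - an empty line ends the block
#   - a line already seen under the same header is dropped
# Step 2: sort by header, then by sequence, so original order is kept per header.
# Step 3: print each header once, followed by its lines.
awk '
/^Start to grab from here/ { hdr = $0; skip = 4; inblock = 1; next }
/^$/                       { inblock = 0; next }
inblock && skip > 0        { skip--; next }
inblock && !seen[hdr, $0]++ { printf "%s\t%d\t%s\n", hdr, ++n, $0 }
' sample.txt |
LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2n |
awk -F '\t' '$1 != prev { print $1; prev = $1 } { print $3 }' > result.txt
cat result.txt
```

This yields the second form of output requested in the question: each header appears once, with the lines from all of its blocks merged and duplicates within a header removed.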