Shell – Insert a newline after a broken sequence of numbers awk/unix/shell scripting

I have a huge file to process and have not managed to get exactly what I need.
Please note that I do not know beforehand how many times the sequence breaks in one file (it could happen more than 1000 times per file).

Below is my input file (TAB delimited), where $1 is the line number. The broken sequence of numbers occurs at $3:

797  47 M797    1     365.0     0.05     0.05 A 0.825
798  47 M798    1     365.0     0.05     0.05 A 0.825
799  47 M799    1     365.0     0.70     0.70 A 0.404
800  47 M800    1     365.0     0.00     0.00 A 0.990
801  47 M802    1     365.0     0.29     0.29 A 0.591
802  47 M803    1     365.0     0.12     0.12 A 0.726

This is what I want:

797  47 M797    1     365.0     0.05     0.05 A 0.825
798  47 M798    1     365.0     0.05     0.05 A 0.825
799  47 M799    1     365.0     0.70     0.70 A 0.404
800  47 M800    1     365.0     0.00     0.00 A 0.990
801  
802  47 M802    1     365.0     0.29     0.29 A 0.591
803  47 M803    1     365.0     0.12     0.12 A 0.726

This is the code I have managed to write so far (filename is test.sh):

awk '
   marker=substr($3,2,6)
   { if (FNR < marker) {printf "\n"}
    }' ${1}

This is the output I got so far:

797       47 M797   1     365.0     0.05     0.05 A 0.825
798       47 M798   1     365.0     0.05     0.05 A 0.825
799       47 M799   1     365.0     0.70     0.70 A 0.404
800       47 M800   1     365.0     0.00     0.00 A 0.990
801       47 M802   1     365.0     0.29     0.29 A 0.591

802       47 M803    1    365.0     0.12     0.12 A 0.726

803       47 M804    1    365.0     0.08     0.08 A 0.777

If anyone has a better solution for this, please let me know.

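#!/usr/bin/python
import sys

def print_fixed_sequence(filename, line_num=0):
    # Plain 'r' mode: the 'U' flag was removed in Python 3.11.
    with open(filename, 'r') as f:
        for line in (x.strip() for x in f):
            # Drop the old line number; keep $2, $3 and the rest of the record.
            _, f1, f2, data = line.split('\t', 3)
            rec_num = int(f2[1:])
            # Print a bare line number for every record missing from the sequence.
            while line_num != rec_num:
                print(line_num)
                line_num += 1
            print('\t'.join((str(line_num), f1, f2, data)))
            line_num += 1

print_fixed_sequence(sys.argv[1], line_num=795)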

795
796
797 47 M797 1 365.0 0.05 0.05 A 0.825
798 47 M798 1 365.0 0.05 0.05 A 0.825
799 47 M799 1 365.0 0.70 0.70 A 0.404
800 47 M800 1 365.0 0.00 0.00 A 0.990
801
802 47 M802 1 365.0 0.29 0.29 A 0.591
803 47 M803 1 365.0 0.12 0.12 A 0.726
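Since the question started from awk, the same gap-filling idea can probably be sketched there as well. This is only a minimal sketch under a few assumptions: the input really is TAB delimited, $3 is always a single letter followed by a number, and that number never decreases. The function name `fill_gaps` and the variables `rec` and `next_num` are made up for illustration. Unlike the Python version, it starts counting at the first record's number instead of a fixed 795, so no filler lines are printed before the first record.

```shell
# Hypothetical helper: renumber $1 from the number in $3 and print a bare
# line number for every record that is missing from the sequence.
fill_gaps() {
    awk 'BEGIN { FS = OFS = "\t" }
    {
        # Numeric part of $3, e.g. M802 -> 802.
        rec = substr($3, 2) + 0
        # Assumption: numbering starts at the first record.
        if (NR == 1) next_num = rec
        # Fill the gap with lines that hold only the missing number.
        while (next_num < rec) {
            print next_num
            next_num++
        }
        # Replace the old line number and emit the record.
        $1 = next_num
        next_num++
        print
    }' "$1"
}
```

Calling `fill_gaps input.txt > output.txt` (file names are placeholders) should write the renumbered records with one filler line per missing number. The loop uses `<` rather than `!=` so that a repeated or out-of-order number does not make it spin forever.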
