How to add 10 lines from a file (file2) to another one after 2 lines (file1)

text processing

I have two different tab-separated files. File 1 looks like this:

transcr_15824   3.95253441295071    3.99992738843234    3.93880798313547
YML042W 10.3143219248979    10.6898819949325    11.0073811719421
transcr_18545   7.76182774638543    7.25508954643215    7.92562682485731
YCR105W 8.46144110056843    8.30995100411912    8.85470858413405
transcr_18545   7.76182774638543    7.25508954643215    7.92562682485731
YMR325W 6.2822794040082 6.46992587787936    7.00507748994596

File 2 looks like this:

YLR177W 11.321823973245 12.1264440368589    11.7777091957438
YOR117W 10.7514234580732    11.3932687209745    11.2587694561818
TY_120  5.95114867088525    5.93580053538449    5.89166059690558
YMR174C 8.49545850099485    8.72467418433346    9.6518559706269
YPL117C 10.7211879012765    10.5046713289602    10.6145538571844
TY2_LTR_77  11.9297940548212    11.9801206538102    12.049127298122
YOL101C 7.76141097131674    9.89522697916433    7.85466704627526
YLR053C 7.62843998411388    7.49205634213499    7.10263942962051
YBR135W 9.70614244227352    9.3114074341804 9.36413815370247
YNL168C 9.93928326709444    10.3036524361223    10.0704544058998

What I'm trying to do right now is to insert 10 lines from File 2 into File 1 after every 2 lines. It should look like this:

transcr_15824   3.95253441295071    3.99992738843234    3.93880798313547
YML042W 10.3143219248979    10.6898819949325    11.0073811719421
YLR177W 11.321823973245 12.1264440368589    11.7777091957438
YOR117W 10.7514234580732    11.3932687209745    11.2587694561818
TY_120  5.95114867088525    5.93580053538449    5.89166059690558
YMR174C 8.49545850099485    8.72467418433346    9.6518559706269
YPL117C 10.7211879012765    10.5046713289602    10.6145538571844
TY2_LTR_77  11.9297940548212    11.9801206538102    12.049127298122
YOL101C 7.76141097131674    9.89522697916433    7.85466704627526
YLR053C 7.62843998411388    7.49205634213499    7.10263942962051
YBR135W 9.70614244227352    9.3114074341804 9.36413815370247
YNL168C 9.93928326709444    10.3036524361223    10.0704544058998
transcr_18545   7.76182774638543    7.25508954643215    7.92562682485731
YCR105W 8.46144110056843    8.30995100411912    8.85470858413405

So, basically, I'm trying to insert 10 lines from File 2 between each pair of transcr_ rows, keeping the existing line that already sits below each transcr_ row.

Edit:

File 2 has around 2,000 lines and File 1 has around 200 "transcr_" rows. So, it would be: take the first 10 lines of File 2 and put them between the first and the second "transcr_" rows (after the already existing line between those two "transcr_" rows). Then take lines 11 to 20 from File 2 and put them between the second and the third "transcr_" rows. Then take lines 21 to 30 from File 2 and put them between the third and the fourth "transcr_" rows, and so on.

It may look like this:

transcr_1
already existing line
10 first lines from `File 2`
transcr_2
already existing line
Lines 11-20 from `File 2`
transcr_3
already existing line
Lines 21-30 from `File 2`
transcr_4
.....

Best Answer

You could use ed!

ed -s file1 <<< $'2r !head -10 file2\nw\nq'

This tells ed to edit file1 with three commands:

  1. after line 2, read in the output of the command head -10 file2 and insert it
  2. write the file out
  3. quit ed

With GNU sed (using the e command, a GNU extension that runs a shell command and prints its output before the addressed line, here line 3):

sed -i '3e head -10 file2' file1
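For comparison, the same single insertion can be sketched with only head and tail, which may help on systems without ed or GNU sed. The function name insert_after and its argument order are made up for this illustration; it writes to standard output instead of editing file1 in place.

```shell
# insert_after FILE1 FILE2 LINE COUNT   (hypothetical helper, not a standard tool)
# Prints FILE1 with the first COUNT lines of FILE2 inserted after line LINE.
insert_after() {
  head -n "$3" "$1"                 # lines 1..LINE of file1
  head -n "$4" "$2"                 # first COUNT lines of file2
  tail -n "+$(( $3 + 1 ))" "$1"     # the rest of file1, from line LINE+1 on
}
```

Usage would look like insert_after file1 file2 2 10 > file1.new, followed by a mv once the result has been checked.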

Extended solution, to iterate through file2

The script below is a for loop that repeats the ed idea as many times as there are transcr_ blocks in file1. Each time through the loop, we calculate three items:

  1. the line number in file1 after which ed inserts the block
  2. the line number for sed to start reading from file2
  3. the line number for sed to stop reading from file2

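Item #1 is spelled out more clearly as: 10*(N-1) + 2*N (the 10*(N-1) lines already inserted by earlier iterations, plus the 2*N original lines up to the end of the Nth transcr_ block), which I reduced to 12*N - 10.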

Items #2 and #3 are spelled out more clearly as 10*(N-1) + 1 through 10*N, which I reduced to 10*N - 9 through 10*N.
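As a quick sanity check of the formulas above, the three values can be printed for the first few iterations. The function name block_offsets is only illustrative, and the arithmetic assumes every transcr_ row in file1 is followed by exactly one other line.

```shell
# block_offsets N   (hypothetical helper for checking the arithmetic)
# Prints: <insertion line in file1> <first line from file2> <last line from file2>
block_offsets() {
  echo "$(( 12 * $1 - 10 )) $(( 10 * $1 - 9 )) $(( 10 * $1 ))"
}

for n in 1 2 3; do
  block_offsets "$n"
done
# prints:
# 2 1 10
# 14 11 20
# 26 21 30
```

So the second block is inserted after line 14 of the already modified file1: 12 lines from the first pass plus the 2 lines of the second transcr_ block.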

I replaced the head command with the more flexible & powerful sed command for picking out blocks of lines from file2.

Note that this rewrites file1 once per loop iteration, so `times` times in total.

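# how many times we need to insert blocks (one per transcr_ row)
times=$(grep -c '^transcr_' file1)
for ((index = 1; index <= times; index++))
do
  # arguments: insertion line in file1, then first and last line to take from file2
  # (single quotes keep "!" safe from history expansion in an interactive shell)
  printf '%dr !sed -n %d,%dp file2\nw\nq\n' \
    "$((12 * index - 10))" "$((10 * index - 9))" "$((10 * index))" |
    ed -s file1
done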
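If rewriting file1 on every iteration is a concern, a single-pass variant is possible with awk. This is a different technique from the ed loop, offered only as a sketch: the function name interleave and the variable names are invented here, and it assumes each transcr_ row is followed by exactly one existing line. It reads file2 sequentially, so if file2 runs out, later blocks simply get fewer lines or none.

```shell
# interleave FILE1 FILE2 [BLOCKSIZE]   (hypothetical helper; BLOCKSIZE defaults to 10)
# Prints FILE1 with the next BLOCKSIZE lines of FILE2 inserted after the line
# that follows each transcr_ row.
interleave() {
  awk -v src="$2" -v n="${3:-10}" '
    { print }                          # always emit the current file1 line
    after_header {                     # this is the line right after a transcr_ row
      for (i = 0; i < n && (getline line < src) > 0; i++)
        print line                     # copy the next n lines from file2
      after_header = 0
      next
    }
    /^transcr_/ { after_header = 1 }   # remember that the next line closes the block
  ' "$1"
}
```

A typical call would be interleave file1 file2 > file1.new, which leaves the original file1 untouched until the output has been inspected.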