How to interleave two txt files with an alternating number of lines

awk, cat, sed, text-processing

file1.txt:

file2.txt:

A
B
C
D
E

Desired output in 3 : 1 ratio (file3.txt)

Commands I have tried:

sed Rfile2.txt file1.txt >file3.txt
paste -d '\n' file1.txt file2.txt >file3.txt

Best Answer

With paste:

paste -d '\n' <file1.txt - - - file2.txt

Note, however, that paste does not stop when one of the files is exhausted: as long as the other file still has lines left (as in your sample), it keeps outputting them, with empty lines standing in for the exhausted file.
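To see that behaviour concretely, here is a small sketch. The file contents are hypothetical (the question's file1.txt is not shown); only the paste invocation comes from the answer.

```shell
# Hypothetical sample data: 6 lines in file1.txt, 5 lines in file2.txt.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf '%s\n' 1 2 3 4 5 6 > file1.txt
printf '%s\n' A B C D E > file2.txt

# Each "-" consumes the next line of stdin (file1.txt), so one output cycle
# is three lines of file1.txt followed by one line of file2.txt.
paste -d '\n' - - - file2.txt < file1.txt > file3.txt
cat file3.txt
```

With this input the first eight lines are 1 2 3 A 4 5 6 B. After that file1.txt is exhausted, so each of the remaining file2.txt lines (C, D, E) is preceded by three empty lines.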

With awk:

awk '{print}; NR % 3 == 0 {getline < "file2.txt"; print}' file1.txt

Or the GNU sed equivalent:

sed '3~3 R file2.txt' file1.txt

Both of these stop as soon as file1.txt is exhausted but carry on if file2.txt is exhausted. In that case the awk variant prints the current file1.txt line a second time (a failed getline leaves $0 unchanged), while the GNU sed variant outputs nothing extra.
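To carry on past the end of file2.txt without emitting anything in its place, as the GNU sed variant does, a portable awk sketch can read into a variable and print only when getline succeeds. The variable name `line` and the sample data are assumptions for illustration.

```shell
# Hypothetical sample data: file2.txt is deliberately too short.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf '%s\n' 1 2 3 4 5 6 7 8 9 > file1.txt
printf '%s\n' A B > file2.txt

out=$(awk '
  { print }
  # After every third line, print one line of file2.txt, but only if a
  # line could actually be read (getline returns 1 on success).
  NR % 3 == 0 && (getline line < "file2.txt") > 0 { print line }
' file1.txt)
printf '%s\n' "$out"
```

Reading into a variable rather than $0 means a failed read cannot cause the previous record to be printed again.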

To stop as soon as either file is exhausted:

awk '{print}
     NR % 3 == 0 {
       if ((getline < "file2.txt") <= 0) exit
       print
     }' file1.txt
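If the ratio or the file names vary, the same logic can be wrapped in a small shell function. This is only a sketch: the function name `interleave` and its argument order are made up here, and the sample data is hypothetical.

```shell
# interleave N MAIN OTHER (hypothetical helper): after every N lines of MAIN,
# emit one line of OTHER; stop as soon as either file runs out.
interleave() {
  awk -v n="$1" -v other="$3" '
    { print }
    NR % n == 0 {
      if ((getline line < other) <= 0) exit
      print line
    }' "$2"
}

tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf '%s\n' 1 2 3 4 5 6 7 > file1.txt
printf '%s\n' A B C D E > file2.txt
printf '%s\n' A > short.txt

interleave 2 file1.txt file2.txt   # 2 : 1 ratio
interleave 3 file1.txt short.txt   # stops once short.txt is exhausted
```

One caveat of passing the name with -v: awk interprets backslash escapes in the value, so file names containing backslashes would need different handling.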

Related Solutions

Awk Join – How to Join Two Files with Matching Columns

join works great:

$ join <(sort File1.txt) <(sort File2.txt) | column -t | tac
 id                           No       P   R   S
 gi|371443198|gb|JH556662.1|  7573913  2   2   0
 gi|371440577|gb|JH559283.1|  6931777  21  19  2

P.S. Does the output column order matter?

If so, use:

$ join <(sort 1) <(sort 2) | tac | awk '{print $1,$3,$4,$5,$2}' | column -t
 id                           P   R   S  No
 gi|371443198|gb|JH556662.1|  2   2   0  7573913
 gi|371440577|gb|JH559283.1|  21  19  2  6931777

Adjust gap between 2 columns to make them look straight

awk 'FNR==1{f+=1;w++;}
     f==1{if(length>w) w=length; next;}
     f==2{printf("%-"w"s",$0); getline<f2; print;}
    ' f2=file2 file1 file1

Note: file1 is intentionally read twice. The first pass finds the maximum line length, and the second pass formats each line and appends the corresponding line from file2. file2 is read programmatically; its name is passed in through awk's variable-assignment-as-argument feature (f2=file2).

Output:

hi             1
wonderful      2
amazing        3
sorry          4
superman       5
superhumanwith 6
loss           7
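The same two-pass idea can also be written with the common NR == FNR idiom, reading the file2 line into a variable so that a short file2 pads with nothing instead of repeating the current line. This is a sketch rather than the answer's own code; the sample files are reconstructed from the output above, and the names `right` and `fmt` are assumptions.

```shell
# Sample data reconstructed from the output shown above (an assumption).
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf '%s\n' hi wonderful amazing sorry superman superhumanwith loss > file1
printf '%s\n' 1 2 3 4 5 6 7 > file2

out=$(awk '
  # Pass 1 (first reading of file1): remember the longest line length.
  NR == FNR { if (length($0) > w) w = length($0); next }
  # Pass 2: pad to w + 1 columns, then append the matching file2 line.
  {
    if ((getline right < f2) <= 0) right = ""
    fmt = "%-" (w + 1) "s%s\n"
    printf fmt, $0, right
  }' f2=file2 file1 file1)
printf '%s\n' "$out"
```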

To handle any number of input files, the following works. Note: it does not cope with the same filename being repeated, i.e. each filename argument must refer to a different file. It can, however, handle files of different lengths: beyond a file's EOF, spaces are used.

awk 'BEGIN{ for(i=1; i<ARGC; i++) { 
              while( (getline<ARGV[i])>0) { 
                 nl[i]++; if(length>w[i]) w[i]=length; }
              w[i]++; close(ARGV[i])
              if(nl[i]>nr) nr=nl[i]; }
            for(r=1; r<=nr; r++) {
              for(f=1; f<ARGC; f++) {
                if(r<=nl[f]) getline<ARGV[f]; else $0=""  
                printf("%-"w[f]"s",$0); } 
              print "" } }
    ' file1 file2 file3 file4

Here is the output with 4 input files:

hi             1 cat   A 
wonderful      2 hat   B 
amazing        3 mat   C 
sorry          4 moose D 
superman       5       E 
superhumanwith 6       F 
loss           7       G 
                       H