join works great:
$ join <(sort File1.txt) <(sort File2.txt) | column -t | tac
id                           No       P   R   S
gi|371443198|gb|JH556662.1|  7573913  2   2   0
gi|371440577|gb|JH559283.1|  6931777  21  19  2
P.S. Does the output column order matter?
If so, use:
$ join <(sort 1) <(sort 2) | tac | awk '{print $1,$3,$4,$5,$2}' | column -t
id                           P   R   S  No
gi|371443198|gb|JH556662.1|  2   2   0  7573913
gi|371440577|gb|JH559283.1|  21  19  2  6931777
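awk 'FNR==1 { f++; w++ }                   # start of a pass; on pass 2 the w++ adds one column of padding
f==1 { if (length > w) w = length; next }  # pass 1: record the longest line of file1
f==2 { printf("%-" w "s", $0); getline < f2; print }  # pass 2: pad the file1 line, append the next file2 line
' f2=file2 file1 file1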
Note: file1 is quite intentionally read twice; the first pass finds the maximum line length, and the second formats each line for the final concatenation with the corresponding line from file2. file2 is read programmatically with getline; its name is provided by awk's variable-as-an-argument feature (f2=file2).
Output:
hi             1
wonderful      2
amazing        3
sorry          4
superman       5
superhumanwith 6
loss           7
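The variable-as-an-argument feature mentioned in the note can be seen in isolation in the sketch below. It is not part of the answer itself: the file name demo.txt and the variable tag are made up for illustration. An operand of the form var=value is not opened as a file; awk performs the assignment when it reaches that operand in the argument list, which is why the same file can be read twice under different settings.

```shell
# Hypothetical demo: demo.txt and tag are illustrative names, not from the answer.
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/demo.txt"

# Each var=value operand is assigned when awk reaches it in the argument list,
# so tag is "first" during the first read and "second" during the second.
out=$(awk '{ print tag ": " $0 }' tag=first "$tmp/demo.txt" tag=second "$tmp/demo.txt")
printf '%s\n' "$out"
rm -rf "$tmp"
```

Here both lines of demo.txt should appear twice, prefixed once with first: and once with second:.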
To handle any number of input files, the following works. Note: it does not cope with the same filename being repeated, i.e. each filename argument must refer to a different file. It can, however, handle files of different lengths: beyond a file's EOF, spaces are used.
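awk 'BEGIN {
  # Pass 1: count the lines and find the widest line of each file.
  for (i = 1; i < ARGC; i++) {
    while ((getline < ARGV[i]) > 0) {
      nl[i]++
      if (length > w[i]) w[i] = length
    }
    w[i]++            # one column of padding
    close(ARGV[i])    # rewind the file for pass 2
    if (nl[i] > nr) nr = nl[i]
  }
  # Pass 2: print row by row, padding each field to the width of its file.
  for (r = 1; r <= nr; r++) {
    for (f = 1; f < ARGC; f++) {
      if (r <= nl[f]) getline < ARGV[f]; else $0 = ""
      printf("%-" w[f] "s", $0)
    }
    print ""
  }
}' file1 file2 file3 file4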
Here is the output with 4 input files:
hi             1 cat   A
wonderful      2 hat   B
amazing        3 mat   C
sorry          4 moose D
superman       5       E
superhumanwith 6       F
loss           7       G
                       H
Best Answer
With paste:
though it would keep outputting lines after one of the files is exhausted if there are still lines left in the other file, as in your sample.
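A paste invocation of the kind described here can be sketched as follows. This is an assumption, not necessarily the exact command from the answer: the file names follow the ones used later in the text, and the sample data is made up. With -d ' ', paste joins corresponding lines with a single space, so the columns are not padded to a common width.

```shell
# Sample data is made up; file1.txt has one more line than file2.txt.
tmp=$(mktemp -d)
printf 'hi\nwonderful\namazing\n' > "$tmp/file1.txt"
printf '1\n2\n' > "$tmp/file2.txt"

# -d ' ' uses a space instead of the default tab between the merged lines.
out=$(paste -d ' ' "$tmp/file1.txt" "$tmp/file2.txt")
printf '%s\n' "$out"
rm -rf "$tmp"
```

The third line of output still appears (as amazing followed by the delimiter), which illustrates the behaviour noted above: paste keeps going until the longer file is exhausted.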
With awk:
Or the GNU sed equivalent:
This time, stop as soon as file1.txt is exhausted, but still carry on if file2.txt is exhausted (and output empty lines in the awk variant and nothing in the GNU sed variant).
To stop as soon as either file is exhausted:
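The two stopping rules described here can be sketched in awk as follows. These are illustrative reconstructions under assumptions, not the answer's original commands: the file names follow the text, the sample data is made up, and the variable names f2 and l are hypothetical.

```shell
# Sample data is made up; file1.txt has one more line than file2.txt.
tmp=$(mktemp -d)
printf 'hi\nwonderful\namazing\n' > "$tmp/file1.txt"
printf '1\n2\n' > "$tmp/file2.txt"

# Stop when file1.txt is exhausted. Once file2.txt runs out, l is reset to ""
# so the remaining file1.txt lines are printed with an empty second field.
out1=$(awk -v f2="$tmp/file2.txt" '
  { if ((getline l < f2) <= 0) l = ""; print $0, l }' "$tmp/file1.txt")

# Stop as soon as either file is exhausted: exit when getline fails on file2.txt.
out2=$(awk -v f2="$tmp/file2.txt" '
  { if ((getline l < f2) <= 0) exit; print $0, l }' "$tmp/file1.txt")

printf '%s\n---\n%s\n' "$out1" "$out2"
rm -rf "$tmp"
```

In the first variant all three file1.txt lines are printed; in the second, output ends after the two lines for which both files still have data.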