Find common elements in a given column from two files and output the column values from each file

awkcommjoin;

I have two files with tab-separated values that look like this:

file1:

file2:

I would like to find rows between files 1 and 2 where the string in column 1 is the same, then get the corresponding values. The desired output is a single file that looks like this:

B    3    3
C    1    2
A    1    9

Can this be done with a Unix one-liner?

Best Answer

GNU coreutils includes the command join that does exactly what you want if line sorting in the result is irrelevant:

join <(sort file1) <(sort file2)

A 1 9
B 3 3
C 1 2

If you want the tabs back, do:

join <(sort file1) <(sort file2) | tr ' ' '\t'

A   1   9
B   3   3
C   1   2

Or use the t option to join.

(<() aka process substitution, requires ksh93 (where the feature originated in), bash or zsh)

Related Solutions

Merging two files according to a common column

Your files were created in Windows so they have Windows style line-endings (\r\n). Remove the \r and everything should work as you expect:

sed -i 's/\r//' File1
sed -i 's/\r//' File2
awk 'FNR==NR{a[$4]=$5;next} {print $1,$2,$3,$4,a[$4]}' File2 File1 > file3

Best Answer

Related Solutions

Merging two files according to a common column

Related Question