Find common elements in a given column from two files and output the column values from each file

awkcommjoin;

I have two files with tab-separated values that look like this:

file1:

A    1
B    3
C    1
D    4

file2:

E    1
B    3
C    2
A    9

I would like to find rows between files 1 and 2 where the string in column 1 is the same, then get the corresponding values. The desired output is a single file that looks like this:

B    3    3
C    1    2
A    1    9

Can this be done with a Unix one-liner?

Best Answer

GNU coreutils includes the command join that does exactly what you want if line sorting in the result is irrelevant:

join <(sort file1) <(sort file2)

A 1 9
B 3 3
C 1 2

If you want the tabs back, do:

join <(sort file1) <(sort file2) | tr ' ' '\t'

A   1   9
B   3   3
C   1   2

Or use the t option to join.

(<() aka process substitution, requires ksh93 (where the feature originated in), bash or zsh)

Related Question