Merge CSV files by first column

awk, columns, csv, join, text-processing

I have 3 csv files like this.

csv 1:

1,aaaa,bbb,2014-04-01
2,qwe,rty,2014-04-03
3,zxc,cvb,2014-04-05

csv 2:

2,j,k,2014-04-01
3,a,s,2014-04-04
5,g,h,2014-04-08

csv 3:

2,a,s,d,f,g,2014-04-01
3,d,f,g,h,j,2014-04-06
4,c,v,b,n,m,2014-04-09

How can I merge them all by the first column? In SQL terms, something like:

SELECT * FROM csv1
JOIN csv2 ON csv1[0] = csv2[0] -- [0] is the position of the first column
JOIN csv3 ON csv1[0] = csv3[0]

The output should be:

 csv1 fields     | csv2 fields |  csv3 fields

 2,qwe,rty,2014-04-03,j,k,2014-04-01,a,s,d,f,g,2014-04-01
 3,zxc,cvb,2014-04-05,a,s,2014-04-04,d,f,g,h,j,2014-04-06

Best Answer

You can do this entirely with POSIX-specified features of join.

join -t, csv[12] | join -t, - csv3

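Here csv[12] is a shell glob that expands to csv1 csv2, and the - makes the second join read the first one's output from standard input. join expects its inputs to be sorted on the join field, which your files already are.

Using your csv1, csv2 and csv3 files as posted, that gives: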

$ join -t, csv[12] | join -t, - csv3
2,qwe,rty,2014-04-03,j,k,2014-04-01,a,s,d,f,g,2014-04-01
3,zxc,cvb,2014-04-05,a,s,2014-04-04,d,f,g,h,j,2014-04-06
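
If the real files are not already sorted on the first column, join may silently miss matching lines, so a sort step would be needed first. A minimal sketch of that, assuming a scratch directory from mktemp (widely available, though not specified by POSIX), the sample rows deliberately written out of order, and .sorted file names that are purely illustrative:

```shell
#!/bin/sh
# Scratch directory for the demo (mktemp -d is common but not POSIX).
tmpdir=$(mktemp -d) || exit 1

# Sample rows from the question, deliberately written out of order.
printf '%s\n' '3,zxc,cvb,2014-04-05' '1,aaaa,bbb,2014-04-01' '2,qwe,rty,2014-04-03' > "$tmpdir/csv1"
printf '%s\n' '5,g,h,2014-04-08' '2,j,k,2014-04-01' '3,a,s,2014-04-04' > "$tmpdir/csv2"
printf '%s\n' '4,c,v,b,n,m,2014-04-09' '3,d,f,g,h,j,2014-04-06' '2,a,s,d,f,g,2014-04-01' > "$tmpdir/csv3"

# Sort each file by its first comma-separated field. LC_ALL=C keeps
# sort and join agreeing on the collation order.
for f in csv1 csv2 csv3; do
    LC_ALL=C sort -t, -k1,1 "$tmpdir/$f" > "$tmpdir/$f.sorted"
done

# Same two-step join as above, now on the sorted copies.
merged=$(LC_ALL=C join -t, "$tmpdir/csv1.sorted" "$tmpdir/csv2.sorted" |
         LC_ALL=C join -t, - "$tmpdir/csv3.sorted")
printf '%s\n' "$merged"

rm -rf "$tmpdir"
```

Note that the keys are compared as strings here, so 10 would sort before 2. That is fine for join as long as sort and join use the same ordering.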