How to Merge Multiple Files with Join Command

awk; join; text processing

Is there a way to join multiple files at once, based on the first column? Usually, I would do:
join File1 File2 > File1+File2 and then join File1+File2 File3 > final_output

Example files:

File1:

1 test1
2 test3
3 test4
4 test5
7 test7

File2:

1 example1
2 example2
3 example3
4 example4
8 example8

File3:

1 foo1
2 foo2
3 foo3
4 foo4
10 foo10

Keep in mind that, for example, the fifth line may differ in each file, and that there are n files.
Edit:

Example output:

1 test1 example1 foo1
2 test3 example2 foo2
3 test4 example3 foo3
4 test5 example4 foo4

On the other hand, I am not sure how lines that don't match in column 1 should be handled (the fifth line).
Thanks

Best Answer

For your three-file example, it basically works like this:

$ join file2 file3| join file1 -
1 test1 example1 foo1
2 test3 example2 foo2
3 test4 example3 foo3
4 test5 example4 foo4

Importantly, all your input files must already be sorted the way join expects (sort -k 1b,1); numerically sorted files like your example may not work! The example above, sorted on the fly, could be written in bash like this:

join <(sort -k 1b,1 file2) <(sort -k 1b,1 file3) | join <(sort -k 1b,1 file1) - \
  | sort -k 1n,1
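
To illustrate why numerically sorted input is not good enough, here is a minimal sketch comparing the two sort orders. The keys 1, 2 and 10 are made up for the demonstration; the order produced by sort -k 1b,1 is the one join expects.

```shell
# Hypothetical sample keys; 10 is the one that exposes the difference.
keys=$(printf '%s\n' 1 2 10)

# Numeric order, which is how the example files appear to be sorted.
numeric=$(printf '%s\n' "$keys" | sort -k 1n,1 | paste -s -d ' ' -)

# Order that join expects: field 1 compared as text, leading blanks ignored.
lexical=$(printf '%s\n' "$keys" | sort -k 1b,1 | paste -s -d ' ' -)

echo "numeric order: $numeric"   # numeric order: 1 2 10
echo "join's order:  $lexical"   # join's order:  1 10 2
```

As soon as a key has more digits than its neighbours, the two orders diverge, so a file that looks sorted to a human can still be out of order for join.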

And finally, the generic case for n files, using a recursive function (tested in bash):

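xjoin() {
    local f
    local srt="sort -k 1b,1"    # the sort order that join expects

    if [ "$#" -lt 2 ]; then
        echo "xjoin: need at least 2 files" >&2
        return 1
    elif [ "$#" -lt 3 ]; then
        # base case: exactly two files left
        join <($srt "$1") <($srt "$2")
    else
        # join the first file with the joined result of all the others
        f=$1
        shift
        join <($srt "$f") <(xjoin "$@")
    fi
}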

xjoin file1 file2 file3 | sort -k 1n,1
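
If process substitution is not available (for example in plain sh), the same idea can be sketched with a loop and temporary files. This is only an illustration, not part of the tested function above; the name xjoin_iter and the xj_ variable names are made up.

```shell
# xjoin_iter: hypothetical loop-based variant of xjoin.
# Uses temporary files instead of process substitution.
xjoin_iter() {
    if [ "$#" -lt 2 ]; then
        echo "xjoin_iter: need at least 2 files" >&2
        return 1
    fi

    xj_acc=$(mktemp) && xj_cur=$(mktemp) && xj_out=$(mktemp) || return 1

    # Start with the first file, sorted the way join expects.
    sort -k 1b,1 "$1" > "$xj_acc"
    shift

    # Fold each remaining file into the accumulated result.
    # The output of join stays sorted on the key, so it can be reused as input.
    for xj_f in "$@"; do
        sort -k 1b,1 "$xj_f" > "$xj_cur"
        join "$xj_acc" "$xj_cur" > "$xj_out"
        mv "$xj_out" "$xj_acc"
    done

    cat "$xj_acc"
    rm -f "$xj_acc" "$xj_cur" "$xj_out"
}
```

It would be called the same way, e.g. xjoin_iter file1 file2 file3 | sort -k 1n,1, and should print the columns in the order the files were given.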

If you know what you are doing, you may omit the sort pipes. But in my experience, join without an explicit sort is very often the cause of trouble.
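
As for the lines whose key does not appear in every file: by default join simply drops them, which is why keys 7, 8 and 10 are missing from the output above. If they should be kept instead, the -a, -e and -o options of join can do that. The following is a minimal two-file sketch; the file names a and b, the sample data and the placeholder NA are assumptions chosen for the illustration.

```shell
# Hypothetical two-file sample, already sorted on field 1.
tmp=$(mktemp -d)
printf '%s\n' '1 test1' '7 test7'       > "$tmp/a"
printf '%s\n' '1 example1' '8 example8' > "$tmp/b"

# -a 1 -a 2 : also print lines from either file that have no partner
# -e NA     : fill fields that are missing with the placeholder NA
# -o ...    : fixed layout: join key, field 2 of file 1, field 2 of file 2
full=$(join -a 1 -a 2 -e NA -o 0,1.2,2.2 "$tmp/a" "$tmp/b")
printf '%s\n' "$full"
# 1 test1 example1
# 7 test7 NA
# 8 NA example8

rm -r "$tmp"
```

Extending this to n files means adjusting the -o list at every step, because the intermediate result gains one column per joined file.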
