Stack data from multiple delimited files into one, with variable columns

awkcsvfilesmergetext processing

I'm new to scripting. Got struck with a file merge issue in Unix. Was looking for some direction and stumbled upon this site. I saw many great posts and replies but couldn't find a solution to my issue. Greatly appreciate any help..

I have three csv files -> Apex_10_Latest.csv, Apex_20_Latest.csv, Apex_30_Latest.csv. Number of columns are varying in these 3 files. Typically the latest file, based on the numbering, might have some new columns appended to the end. So I want to take the latest header and stack the data from all the 3 files into a new file Apex.csv. When stacking the data from older file which might have less columns than latest file, I want the data to be populated as null with appropriate delimiters..

Also this has to be done recursively for a multiple set of files (3 each), all in the same folder.
– Apex_10_Latest.csv,Apex_20_Latest.csv,Apex_30_Latest.csv – merged into Apex.csv
– Code_10_Latest.csv,Code_20_Latest.csv,Code_30_Latest.csv – merged into Code.csv
– Trans_10_Latest.csv,Trans_20_Latest.csv,Trans_30_Latest.csv – merged into Trans.csv

Following is the format of the source files and expected target file…
SOURCE FILES:

  • Apex_30_Latest.csv:
    A,B,C,D
    1,2,3,4
    2,3,4,5
    3,4,5,6

  • Apex_20_Latest.csv:
    A,B,C
    4,5,6
    5,6,7
    6,7,8

  • Apex_10_Latest.csv:
    A,B
    7,8
    8,9
    9,10

EXPECTED TARGET FILE:

  • Apex.csv
    A,B,C,D
    1,2,3,4
    2,3,4,5
    3,4,5,6
    4,5,6,,
    5,6,7,,
    6,7,8,,
    7,8,,,
    8,9,,,
    9,10,,,

Thanks…

Best Answer


with Miller (http://johnkerl.org/miller/doc/) as usual is very easy

mlr --csv unsparsify Apex_*_Latest.csv

gives you

A,B,C,D
1,2,3,4
2,3,4,5
3,4,5,6
4,5,6,
5,6,7,
6,7,8,
7,8,,
8,9,,
9,10,,
Related Question