CSV – Determine Maximum Column Length for Every Column in a Simplified CSV File

bash, csv-simple, shell-script, text-processing

To determine the maximum length of each column in a comma-separated CSV file, I hacked together a bash script. It produces the correct output on a Linux system, but I need it to run on OS X, and it relies on the GNU version of wc, which supports the option -L (--max-line-length).

The version of wc on OS X does not support that option, so I'm looking for an alternative.

My script (which may not be that good; it reflects my poor scripting skills, I guess):

#!/bin/bash

for((i=1;i< `head -1 $1|awk '{print NF}' FS=,`+1 ;i++));
    do echo  | xargs echo -n "Column$i: " && 
    cut -d, -f $i $1 |wc -L  ; done

Which prints:

Column1: 6
Column2: 7
Column3: 4
Column4: 4
Column5: 3

For my test-file:

123,eeeee,2323,tyty,3
154523,eegfeee,23,yty,343

I know installing GNU coreutils through Homebrew might be a solution, but that's not a path I want to take, as I'm sure this can be solved without modifying the system.

Best Answer

Why not use awk?

I don't have a Mac to test on, but length() is a standard awk function, so this should work.

awk file (test.awk):

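# Track the longest value seen in each column.
{
    if (NF > maxnf) maxnf = NF
    for (i = 1; i <= NF; i++) {
        len = length($i)
        if (len > linesize[i]) linesize[i] = len
    }
}
# Print the columns in order; "for (i in linesize)" visits keys in unspecified order.
END {
    for (i = 1; i <= maxnf; i++) printf "Column%d: %d\n", i, linesize[i]
}

Then run:

mybox$ awk -F, -f test.awk a.txt
Column1: 6
Column2: 7
Column3: 4
Column4: 4
Column5: 3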
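
If a separate awk file is inconvenient, the same idea can be wrapped in shell functions. The following is a minimal sketch, not a tested OS X recipe: the function names max_col_lengths and max_line_length are made up for illustration, and it assumes a simple CSV without quoted fields or embedded commas. The second function is a rough stand-in for wc -L in the original script. Note that length() counts characters or bytes depending on the awk implementation and locale, so results for non-ASCII or tab-containing input may differ from GNU wc -L.

```shell
#!/bin/sh

# Hypothetical helper (illustrative name): print the maximum field length
# of every column in a simple comma-separated file given as $1.
max_col_lengths() {
    awk -F, '
        {
            # Remember the widest row so every column gets reported.
            if (NF > maxnf) maxnf = NF
            for (i = 1; i <= NF; i++)
                if (length($i) > max[i]) max[i] = length($i)
        }
        END {
            # Iterate by index so the output is in column order.
            for (i = 1; i <= maxnf; i++)
                printf "Column%d: %d\n", i, max[i]
        }
    ' "$1"
}

# Hypothetical helper (illustrative name): length of the longest line on
# standard input, usable where the original script piped into wc -L.
max_line_length() {
    awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }'
}
```

After sourcing these definitions, max_col_lengths a.txt prints one line per column, and cut -d, -f 2 a.txt | max_line_length gives the longest value in the second column.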