CSV file processing – remove quotes and replace comma delimiter with tab

awkcsvrheltext processing

Need to convert delimited files with a shell command. There are two variations for the received input file, one with double quotes and another without quotes and both files have the comma as a delimiter. The requirement is to replace the comma with TAB and remove the quotes if the file has double quotes else just replace the comma. A file is sent with quotes if its fields also have comma that needs to be ignored while transforming.
The single command should be effective in Awk 3.x on RHEL 6.x environment.

eg. File 1 with double quotes:

"Jhon","Carpenter","CA,TX,NJ"
"Mike","Painter","WA,GA,MI"

Post transformation it should be TAB separated:

Jhon   Carpenter   CA,TX,NJ 
Mike   Painter     WA,GA,MI 

eg. File 2 without double quotes:

EMP1,123456,CA 
EMP2,456789,TX 

Post transformation it should be TAB separated:

EMP1 123456   CA 
EMP2   456789   TX

Best Answer

This short sed script can process both types of files (or even mixed ones with lines of first and second type):

sed '/"/!s/,/\t/g;s/","/\t/g; s/"//g'

As it doesn't group expressions not loop it should be way faster than your script.

You seem to have GNU sed, so \t works, otherwise use the literal TAB instead.