Parsing CSV with AWK to Produce HTML Output

Tags: awk, csv, html

I have a data file with rows of comma-separated fields like this:

United Kingdom, GB, +44

and I want to produce the following output for each line in the file:

<option value="GB">United Kingdom +44</option>

I got this far with awk, but after adding the angle brackets the output comes out mangled:

BEGIN { FS = "," }
{
    print "<option value=\"" $2 "\">"
}

Best Answer

awk's printf function can be easier to use when it comes to embedded quotes.

awk -F, '{printf("<option value=\"%s\">%s %s</option>\n", $2, $1, $3)}'
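Before going further, it can help to look at what each field actually contains when the line is split on a bare comma. The following is only a quick check, not part of the solution; the variable names `line` and `fields` are made up for illustration, and the sample line is the one from the question.

```shell
# Sample line taken from the question.
line='United Kingdom, GB, +44'

# Wrap each field in brackets so that any leading spaces become visible.
fields=$(printf '%s\n' "$line" | awk -F, '{printf("[%s] [%s] [%s]\n", $1, $2, $3)}')
echo "$fields"
# prints: [United Kingdom] [ GB] [ +44]
```

The second and third fields each start with a space, which ends up inside the value attribute and the option text.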

The problem with that is that the input also has a space after each comma, so the second and third fields keep a leading space (" GB" and " +44"). We can use the gsub function to trim those fields.

awk -F, '{gsub(/^ +| +$/, "", $2); gsub(/^ +| +$/, "", $3); printf("<option value=\"%s\">%s %s</option>\n", $2, $1, $3)}'

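Or, more simply, change the field separator to a regular expression that absorbs the spaces around each comma:

awk -F' *, *' '{printf("<option value=\"%s\">%s %s</option>\n", $2, $1, $3)}'

Note that this only removes spaces next to a comma; spaces at the very start or end of the line stay in the first and last fields.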
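The separator-based approach can be sketched end to end as follows. The wrapper name `csv_to_options` and the second sample line are assumptions made for the example; only the first line comes from the question.

```shell
# csv_to_options is a hypothetical wrapper name used for this sketch.
csv_to_options() {
    # The separator regex consumes the comma plus any spaces around it,
    # so $2 and $3 arrive without leading blanks.
    awk -F' *, *' '{printf("<option value=\"%s\">%s %s</option>\n", $2, $1, $3)}' "$@"
}

printf '%s\n' 'United Kingdom, GB, +44' 'France ,FR,  +33' | csv_to_options
# prints:
# <option value="GB">United Kingdom +44</option>
# <option value="FR">France +33</option>
```

Passing "$@" through means the same function works on a file name (csv_to_options data.csv) or on standard input.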

If you need to trim multiple fields then it might be better to use a loop or a function (depending on the situation). See https://stackoverflow.com/questions/9985528/how-can-i-trim-white-space-from-a-variable-in-awk for more info.
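One way the loop-plus-function idea might look is sketched below. This is an illustration rather than the method from the linked question: `trim()` is a user-defined awk function (awk has no built-in trim), and `trim_to_options` is a hypothetical wrapper name.

```shell
# trim_to_options is a hypothetical wrapper name; trim() is a user-defined
# awk function, not a built-in.
trim_to_options() {
    awk -F, '
    function trim(s) {
        gsub(/^[ \t]+|[ \t]+$/, "", s)   # strip leading and trailing blanks
        return s
    }
    {
        # Assigning to $i trims the field in place.
        for (i = 1; i <= NF; i++) $i = trim($i)
        printf("<option value=\"%s\">%s %s</option>\n", $2, $1, $3)
    }' "$@"
}

printf '%s\n' '  United Kingdom , GB , +44  ' | trim_to_options
# prints: <option value="GB">United Kingdom +44</option>
```

Unlike the separator regex, this also cleans up blanks at the start of the first field and the end of the last one.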
