Awk Field Separator Bug Explained

awk

With a single-character FS, the awk field separator works as expected:

$ echo 'fooXbar' | awk 'BEGIN {FS="X"} {print $1}'
foo
$ echo 'fooXbar' | awk 'BEGIN {FS="X"} {print $2}'
bar
$ 

But if FS is "-|-", the results are surprising:

$ echo 'foo-|-bar' | awk 'BEGIN {FS="-|-"} {print $1}'
foo
$ echo 'foo-|-bar' | awk 'BEGIN {FS="-|-"} {print $2}'
|
$ echo 'foo-|-bar' | awk 'BEGIN {FS="-|-"} {print $3}'
bar
$ 

Why is $2 a "|" in the second example, and why does "bar" only show up as $3?

UPDATE: escaping the pipe with a single backslash does not help either:

$ echo 'foo-|-bar' | awk 'BEGIN {FS="-\|-"} {print $2}'
awk: warning: escape sequence `\|' treated as plain `|'
|
$ 

Best Answer

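FS is a regex: when FS is longer than one character, awk treats it as a regular expression, so -|- means "- or -". Every single dash is then a separator, which leaves "|" as a field of its own between the two dashes.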
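A quick way to check this is to count the fields awk sees. This is only a sketch: count_fields is a helper name made up for the example, and it passes the separator with -v FS=..., which is safe here because neither separator contains a backslash.

```shell
# Hypothetical helper: print how many fields awk finds on each input line.
count_fields() {
  # $1 = field separator to test; the input line is read from stdin
  awk -v FS="$1" '{ print NF }'
}

# The regex "-|-" ("dash or dash") matches each dash separately ...
echo 'foo-|-bar' | count_fields '-|-'
# ... so it splits exactly like a plain single dash does.
echo 'foo-|-bar' | count_fields '-'
```

Both commands print 3: the fields are "foo", "|" and "bar".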

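Use the regex -\|- instead, so that the pipe is matched literally. Inside an awk string literal the backslash has to be doubled, because awk processes string escapes before it compiles the regex. That is why the single \| in the update produced a warning and was treated as a plain |.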

echo 'foo-|-bar' | awk 'BEGIN {FS="-\\|-"} {print $1}'

or

echo 'foo-|-bar' | awk -F '-\\|-' '{print $2}'
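If counting backslashes feels fragile, a bracket expression is another option, since "|" is an ordinary character inside [...]. A minimal sketch, where literal_split is a hypothetical helper name used only for this example:

```shell
# Hypothetical helper: split on the literal string "-|-" and print one field.
literal_split() {
  # $1 = number of the field to print; the input line is read from stdin
  awk -v n="$1" 'BEGIN { FS = "-[|]-" } { print $n }'
}

echo 'foo-|-bar' | literal_split 1   # prints foo
echo 'foo-|-bar' | literal_split 2   # prints bar
```

Because the separator contains no backslash, it reads the same in a string literal, with -F, or with -v FS=....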