Using ' (' (space followed by parenthesis) as field separator in awk

awk regex

In an awk script I am trying to use ' (' as the field separator.
However, unless I escape the bracket with a double backslash, like this:

BEGIN {FS=" \\("}

it does not work.

If I use FS=" \(" I get

awk: prog:2: warning: escape sequence `\(' treated as plain `('
awk: prog:2: fatal: :, [., or [=: / (/

output and if I don't escape the bracket at all I get just the

awk: prog:2: fatal: :, [., or [=: / (/ message.

Can you please explain this behaviour?

Best Answer

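To use ␣( (space+parenthesis) as the field separator on the awk command line, pass "␣\\\(" in double quotes; the shell reduces \\ to \ inside double quotes, so awk receives ␣\\(: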

$ echo "a (b (c" | awk -F " \\\(" '{ print $1; print $2; print $3 }'
a
b
c

Alternatively, use single quotes and two backslashes:

$ echo "a (b (c" | awk -F ' \\(' '{ print $1; print $2; print $3 }'
a
b
c

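The reason for this is that ␣( (a single parenthesis with a leading space) is a malformed regular expression: the left parenthesis opens a group that is never closed, so it has to be escaped as \(. Two backslashes are needed because FS is a string, and awk processes string escapes before it interprets the result as a regular expression. " \\(" becomes the regex ␣\(, whereas in " \(" the backslash is dropped (hence the warning) and the regex is once again the unclosed ␣(.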
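
As a minimal sketch of the escaping inside a script: the sample input, the shell variable names, and the bracket-expression variant " [(]" are illustrative additions, not part of the question.

```shell
# FS is a string constant: awk first processes string escapes ("\\" -> "\"),
# then uses the result, " \(", as the regular expression.
escaped=$(printf 'a (b (c\n' | awk 'BEGIN { FS = " \\(" } { print $2 }')
echo "$escaped"

# A bracket expression matches a literal "(" without any backslashes.
bracket=$(printf 'a (b (c\n' | awk 'BEGIN { FS = " [(]" } { print $2 }')
echo "$bracket"
```

Both commands should print b. The bracket form avoids the question of how many backslashes survive each layer of quoting.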

The reason that ( (a single parenthesis without a leading space) works is that when FS is a single character other than a space, awk uses it literally rather than as a regular expression.
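
A quick way to check the single-character case; the input string and the variable name here are made up for illustration.

```shell
# A one-character FS (other than a space) is taken literally,
# so the unescaped parenthesis is accepted as a separator.
single=$(printf 'a(b(c\n' | awk -F '(' '{ print $1 "," $2 "," $3 }')
echo "$single"
```

This should print a,b,c with no warning or error.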