I have always used single quotes for the field separator, like: awk -F';' ...
Quite new to me is the way of using a backslash, like: awk -F\; ...
Is there a technical difference between the two, or is it just a matter of preference?
awk quoting shell
Best Answer
That's all to do with your shell, not with `awk`.

In Bourne-like shells, `\`, `'...'` and `"..."` are all quoting operators. Quoting removes the special meaning a character may have in the syntax of the shell.
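
As a quick illustration, here is a minimal sketch assuming a POSIX-like `sh`. It passes a `;` (normally a command separator) to `printf` as ordinary data using each of the three operators. The variable name `quoted` is made up for the example.

```shell
# Each of the three arguments reaches printf as the single character ";".
# Left unquoted, the ";" would end the printf command instead.
quoted=$(printf '<%s>' \; ';' ";")
printf '%s\n' "$quoted"   # <;><;><;>
```

`printf` reuses its format for every argument, so three identical `<;>` groups show that the shell removed the quoting before the command ever saw it.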
`\` quotes a single character (except for newline, which it removes instead); `'...'` and `"..."` can quote more than one (with `"..."` not quoting every character).

`;` is a special character in the syntax of the shell. It's used to separate commands. You want to quote it if you want to pass it verbatim to a command. `\;` or `';'` will do. `";"` will also do, as `;` is not one of those characters that are still special within double quotes, but you'd need `"\\"` to pass one literal backslash to a command because `\` is one of those characters that are still special within `"..."` (though it's then only special when followed by other special characters within `"..."`, like `"` itself).

Again, that very much depends on the shell. In the `rc` shell for instance, `\` and `"` are not special, let alone quoting characters; `-F\;` wouldn't work there, as the command line would be parsed as two commands, `awk -F\` and `...`, separated by `;`.

See "How to use a special character as a normal one?" for more details.
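
To tie this back to the question, the sketch below (assuming a Bourne-like shell and any POSIX `awk`) runs the same split with all three quoting forms, then checks the double-quote rule for backslash. The sample record and variable names are made up for illustration.

```shell
# All three forms hand awk the very same argument: -F;
out_bs=$(printf 'a;b;c\n' | awk -F\; '{print $2}')
out_sq=$(printf 'a;b;c\n' | awk -F';' '{print $2}')
out_dq=$(printf 'a;b;c\n' | awk -F";" '{print $2}')
printf '%s %s %s\n' "$out_bs" "$out_sq" "$out_dq"   # b b b

# Inside "...", a backslash is still special before another backslash,
# so "\\" passes ONE literal backslash, exactly like '\' does.
one_dq=$(printf '%s' "\\")
one_sq=$(printf '%s' '\')
printf '%s %s\n' "$one_dq" "$one_sq"
```

So for `;` the choice between `\;`, `';'` and `";"` is a matter of preference; the forms only start to differ for characters such as `\` that double quotes do not fully quote.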
To complicate things further, note that the argument to `-F` itself also goes through one or two layers of backslash processing by `awk`.

`awk` first processes the argument it receives to expand ANSI C escape sequences in it. If you use `awk -F '\t'` or `awk -F \\t` or `awk -F "\\t"` or `awk -F "\t"`, `awk` receives an argument that contains `\t`, which it expands to a TAB character. The `FS` awk variable will contain a TAB character, not `\t`.

With `awk -F '\\'`, `awk` receives a `\\` argument and sets `FS` to the `\` character. Strictly speaking, the behaviour of `awk -F '\'` is unspecified as that escape sequence is unfinished, but in practice, except for busybox `awk`, all `awk` implementations I know treat it the same as `awk -F '\\'`.

In `awk`, when `FS` contains a single character, that character is the field separator. `awk -F .` splits the records on dot characters.

However, when `FS` contains more than one character, it is interpreted as a regular expression. `awk -F ..` doesn't split on sequences of two dots, but on sequences of any two characters, as `.` is the regular expression operator that matches any single character. To split on two dots, you'd need `awk -F '[.][.]'` or `awk -F '\\.\\.'`.

With `awk -F '\\\\'`, a literal `\\\\` is passed by the shell to `awk`; `awk` expands each of those two `\\` to `\`, so `FS` becomes `\\`, which is treated as a regular expression. `\` is also special in the regular expression syntax, where it is used to remove the special meaning of a character, as a regex operator this time. So again, that is splitting on backslash characters, though this time as a regular expression.

So, in practice, to split on `\`, both forms discussed above, `awk -F '\\'` and `awk -F '\\\\'`, will work in Bourne-like shells, as will their equivalents written with the other shell quoting operators.

I would advise using single quotes, as they are the most straightforward and least surprising kind of quotes. So here, to split on backslash portably: `awk -F '\\'`.

You can also do things like:
Or
or
or:
(that one avoiding the extra backslash expansion performed by `awk`, so it needs only one backslash).
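
Since the layers are easy to lose track of, here is a small sketch of the awk-side behaviour, assuming a Bourne-like shell and a POSIX `awk`. The sample records, the variable names and the `SEP` environment variable are all made up for illustration. The `ENVIRON` form is just one approach I know of that bypasses awk's escape-sequence expansion.

```shell
# Sample records (made up for this illustration).
tabbed=$(printf 'x\ty z')   # x<TAB>y z
dotted='a.b..c'
slashed='A\B'               # contains one literal backslash

# awk expands the \t it receives into a TAB, so only the TAB splits.
tab_field=$(printf '%s\n' "$tabbed" | awk -F '\t' '{print $2}')       # y z

# Single-character FS: taken literally, so "." is a plain dot here.
dot_count=$(printf '%s\n' "$dotted" | awk -F . '{print NF}')          # 4

# Multi-character FS: a regular expression; [.][.] matches two dots.
two_dots=$(printf '%s\n' "$dotted" | awk -F '[.][.]' '{print $1}')    # a.b

# Splitting on a backslash in several ways; each should yield B.
bs_F=$(printf '%s\n' "$slashed" | awk -F '\\' '{print $2}')           # FS is \
bs_re=$(printf '%s\n' "$slashed" | awk -F '\\\\' '{print $2}')        # FS is the regex \\
bs_v=$(printf '%s\n' "$slashed" | awk -v 'FS=\\' '{print $2}')        # -v also expands escapes
# Values read from ENVIRON are not escape-processed: one backslash is enough.
bs_env=$(printf '%s\n' "$slashed" |
  SEP='\' awk 'BEGIN { FS = ENVIRON["SEP"] } { print $2 }')

printf '%s\n' "$tab_field" "$dot_count" "$two_dots" \
  "$bs_F" "$bs_re" "$bs_v" "$bs_env"
```

Single quotes are used throughout so that the shell passes every backslash through untouched, which leaves only awk's own processing to reason about.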