Early shells had only a single data type: strings. But it is common to manipulate lists of strings, typically when passing multiple file names as arguments to a program. Another common use case for splitting is when a command outputs a list of results: the command's output is a string, but the desired data is a list of strings. To store a list of file names in a variable, you would put spaces between them. Then a shell script like this
files="foo bar qux"
myprogram $files
called myprogram with three arguments, as the shell split the string $files into words. At the time, spaces in file names were either forbidden or widely considered Not Done.
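To see this splitting for yourself, here is a minimal sketch for a Bourne-style shell such as sh or bash (not zsh, whose default rules differ). The count_args function is a hypothetical helper introduced only for illustration; it prints how many arguments it received.

```sh
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { echo "$#"; }

files="foo bar qux"
unquoted=$(count_args $files)     # bare expansion: split into three words
quoted=$(count_args "$files")     # double quotes: one single argument
echo "unquoted: $unquoted, quoted: $quoted"
# prints "unquoted: 3, quoted: 1"
```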
The Korn shell introduced arrays: you could store a list of strings in a variable. The Korn shell remained compatible with the then-established Bourne shell, so bare variable expansions kept undergoing word splitting, and using arrays required some syntactic overhead. You would write the snippet above like this:
files=(foo bar qux)
myprogram "${files[@]}"
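The overhead pays off as soon as an element contains a space. A minimal sketch, assuming bash or another shell with ksh-style arrays (plain sh does not have them); count_args is again a hypothetical helper that prints its argument count.

```bash
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { echo "$#"; }

files=("foo bar" qux)                        # two elements, the first has a space
as_array=$(count_args "${files[@]}")         # each element stays one argument
as_bare=$(count_args ${files[@]})            # unquoted: elements are split again
echo "quoted array: $as_array, bare array: $as_bare"
# prints "quoted array: 2, bare array: 3"
```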
Zsh had arrays from the start, and its author opted for a saner language design at the expense of backward compatibility. In zsh (under the default expansion rules) $var does not perform word splitting; if you want to store a list of words in a variable, you are meant to use an array; and if you really want word splitting, you can write $=var.
files=(foo bar qux)
myprogram $files
These days, spaces in file names are something you need to cope with, both because many users expect them to work and because many scripts are executed in security-sensitive contexts where an attacker may be in control of file names. So automatic word splitting is often a nuisance; hence my general advice to always use double quotes, i.e. write "$foo", unless you understand why you need word splitting in a particular use case. (Note that bare variable expansions undergo globbing as well.)
The answers you found on Stack Exchange are right and this tutorial is wrong. You can experiment by yourself or look it up in the standard. An unset IFS is equivalent to setting it to the default value of space-tab-newline, while an empty IFS effectively turns off field splitting.
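One way to run that experiment is sketched below. It assumes a modern POSIX-style shell; count_args is a hypothetical helper that prints its argument count, and the variable names are made up for the example.

```sh
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { echo "$#"; }

words="foo bar qux"

unset IFS
with_unset=$(count_args $words)   # unset IFS behaves like space-tab-newline

IFS=
with_empty=$(count_args $words)   # empty IFS: no field splitting at all

unset IFS                         # back to the default behaviour
echo "unset IFS: $with_unset, empty IFS: $with_empty"
# prints "unset IFS: 3, empty IFS: 1"
```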
You can consult Sven Mascheck's page on IFS about historical implementations. A few historical shells didn't like an unset IFS, and a very old version of ksh treated it like an empty IFS, but all modern shells and most old shells treat an unset IFS like the default value.
You should not start your script with IFS= unless you want to turn off field splitting (which can be a reasonable decision, but note that you still need to put double quotes around substitutions to avoid globbing, unless you turn that off with set -f too). To reset the default value, use unset IFS. It's debatable whether this is useful at the beginning of a script; there are plenty of other bad things such as a dodgy PATH that the caller can do to make your script go wrong.
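The point about globbing is easy to miss, so here is a rough sketch of it. It assumes mktemp is available and that the temporary directory's path contains no wildcard characters; count_args is a hypothetical helper that prints its argument count, and the file names are invented for the example.

```sh
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { echo "$#"; }

tmpdir=$(mktemp -d)
touch "$tmpdir/a.txt" "$tmpdir/b.txt"
pattern="$tmpdir/*.txt"

IFS=                               # field splitting is now off...
glob_on=$(count_args $pattern)     # ...but the pattern still expands to 2 files

set -f                             # turn off globbing as well
glob_off=$(count_args $pattern)    # the pattern stays one literal word

set +f                             # restore globbing
unset IFS                          # restore default field splitting
rm -r "$tmpdir"
echo "IFS= only: $glob_on, IFS= with set -f: $glob_off"
# prints "IFS= only: 2, IFS= with set -f: 1"
```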
This tutorial also advises resetting PATH. This is usually bad advice. In most cases, you cannot predict what the correct search path is, but the user knows. How do you know whether /usr/local/bin or /home/bob/bin contains bug-fixed versions of utilities on an ancient unix where the ones in /usr/bin are buggy? Do you really want to embed all the logic to figure out whether to put /usr/xpg6/bin ahead of /bin? At what position do you want /usr/gnu/bin? Do not reset PATH unless your script targets a specific system.
I haven't read this tutorial, but I did check one thing: it doesn't tell you right from the start to always put double quotes around variable substitutions and command substitutions. So I do not think this tutorial is a good one.
Best Answer
Let's break this down into pieces.
This code runs the command : with some arguments. The : command does nothing and ignores its arguments. Therefore the whole command line does nothing, except for whatever side effects happen in the arguments.
The syntax ${parameter_name:=value} exists in all non-antique Bourne-style shells, including ash, bash, ksh and zsh. It sets the parameter to a default if necessary. In other words, if parameter_name is not set or is set to an empty value, then set it to the indicated value; and then run the command, using the new parameter value. There is a variant, ${parameter_name=value}, which leaves the parameter empty if it was empty, only using the indicated value if the parameter was unset.
You'll find this syntax documented under “parameter expansion” in the POSIX spec, and the dash, bash, ksh and zsh manuals.
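The difference between the two forms shows up only for a parameter that is set but empty. A small sketch, with variable names invented for the illustration:

```sh
# The := form assigns when the parameter is unset OR empty.
unset unset_a
empty_a=
: ${unset_a:=default}
: ${empty_a:=default}

# The = form assigns only when the parameter is unset.
unset unset_b
empty_b=
: ${unset_b=default}
: ${empty_b=default}

echo ":= unset [$unset_a] empty [$empty_a]"
echo "=  unset [$unset_b] empty [$empty_b]"
# prints ":= unset [default] empty [default]"
# prints "=  unset [default] empty []"
```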
There are variations on this syntax, in particular ${parameter_name:-value}, which lets you use a default value for this expansion only, without assigning to the parameter.
In summary, : ${parameter_name:=value} is a concise way of writing