tar -czf backup.tgz "$exclude1" "$exclude2" ${exclude3+"$exclude3"} 2>&1
${exclude3+"$exclude3"} expands to nothing if $exclude3 is unset, and to "$exclude3" if it is set (and similarly for the other variables that are potentially unset).
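A minimal sketch of that behaviour, assuming a POSIX-style shell such as sh or bash. The count_args helper is hypothetical and only reports how many arguments it received:

```shell
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() {
  printf '%d\n' "$#"
}

unset exclude3
count_args ${exclude3+"$exclude3"}    # unset: no argument at all, prints 0

exclude3=''
count_args ${exclude3+"$exclude3"}    # set but empty: one empty argument, prints 1

exclude3='*.log'
count_args ${exclude3+"$exclude3"}    # set: one argument, not split or globbed, prints 1
```

The inner quotes are what keep the value as a single argument when the variable is set.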
Note that there is a difference between an unset variable and a variable that is set to the empty string, so you should use unset exclude3 instead of exclude3='' in
./input $(cmd)
Because $(cmd) is unquoted, that's a split+glob operator. The shell retrieves the output of cmd, removes all the trailing newline characters, then splits that based on the value of the $IFS special parameter, and then performs filename generation (for instance turns *.txt into the list of non-hidden txt files in the current directory) on the resulting words (that latter part not with zsh) and, in the case of ksh, also performs brace expansion (turns a{b,c} into ab and ac for instance).
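The filename generation step can be seen with a small sketch. It assumes sh or bash with the default $IFS and globbing enabled, and a mktemp that supports -d. The cmd, input and glob_demo functions are hypothetical stand-ins:

```shell
# cmd, input and glob_demo are hypothetical names used only for this sketch.
cmd() {
  printf '*.txt\n'
}
input() {
  printf 'I got %d arguments:\n' "$#"
  [ "$#" -eq 0 ] || printf ' - <%s>\n' "$@"
}

# Run in a subshell so that the cd does not affect the calling shell.
glob_demo() (
  dir=$(mktemp -d) || exit
  cd "$dir" || exit
  touch a.txt b.txt .hidden.txt
  input $(cmd)      # unquoted: split+glob turns *.txt into the matching files
  input "$(cmd)"    # quoted: the output is passed as one literal argument
  cd / && rm -rf "$dir"
)

glob_demo
```

In those shells the first call should receive a.txt and b.txt (the hidden file is skipped), and the second the single literal argument *.txt.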
The default value of $IFS contains the SPC, TAB and NL characters (also NUL in zsh; other shells either remove the NULs or choke on them). Those (not NUL) also happen to be IFS-whitespace characters¹, which are treated specially when it comes to IFS-splitting.
If the output of cmd is " a b\nc \n", that split+glob operator will generate "a", "b" and "c" arguments to ./input. With IFS-whitespace characters, it's impossible for split+glob to generate an empty argument because sequences of one or more IFS-whitespace characters are treated as one delimiter. To generate an empty argument, you'd need to choose a separator that is not an IFS-whitespace character. Actually, any non-whitespace character will do (best to also avoid multi-byte characters, which are not supported by all shells here).
So for instance if you do:
IFS=: # split on ":" which is not an IFS-whitespace character
set -o noglob # disable globbing (also brace expansion in ksh)
./input $(cmd)
And if cmd outputs a::b\n, then that split+glob operator will result in "a", "" and "b" arguments (note that the "s are not part of the value, I'm just using them here to help show the values).
With a:b:\n, depending on the shell, that will result in "a" and "b" or "a", "b" and "". You can make it consistent across all shells with
./input $(cmd)""
(which also means that for an empty output of cmd (or an output consisting only of newline characters), ./input will receive one empty argument as opposed to no argument at all).
Example:
cmd() {
  printf 'a b:: c\n'
}
input() {
  printf 'I got %d arguments:\n' "$#"
  [ "$#" -eq 0 ] || printf ' - <%s>\n' "$@"
}
IFS=:
set -o noglob
input $(cmd)
gives:
I got 3 arguments:
- <a b>
- <>
- < c>
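The trailing-delimiter and empty-output cases mentioned earlier can be checked the same way. This is a sketch only: trailing, empty and trailing_demo are hypothetical names, and the subshell is there so the IFS and noglob changes do not leak into the calling shell:

```shell
# input, trailing, empty and trailing_demo are hypothetical names for this sketch.
input() {
  printf 'I got %d arguments:\n' "$#"
  [ "$#" -eq 0 ] || printf ' - <%s>\n' "$@"
}
trailing() { printf 'a:b:\n'; }   # output that ends in a delimiter
empty() { :; }                    # no output at all

# Subshell body: IFS and noglob are restored automatically on return.
trailing_demo() (
  IFS=:
  set -o noglob
  input $(trailing)""   # a, b and an empty third argument
  input $(empty)""      # one empty argument instead of none
)

trailing_demo
```

With the appended "", the first call should report three arguments and the second one, whichever shell runs it.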
Also note that when you do:
./input ""
Those " are part of the shell syntax; they are shell quoting operators. Those " characters are not passed to input.
¹ IFS-whitespace characters, per POSIX, are the characters classified as [:space:] in the locale that also happen to be in $IFS, though in ksh88 (on which the POSIX specification is based) and in most shells, that's still limited to SPC, TAB and NL. The only POSIX-compliant shell in that regard I found was yash. ksh93 and bash (since 5.0) also include other whitespace (such as CR, FF, VT...), but limited to the single-byte ones (beware that on some systems like Solaris, that includes the non-breaking space, which is single-byte in some locales).
Best Answer
The POSIX Base Definitions has a section on "Utility Conventions" which applies to the POSIX base utilities.
The standard getopts utility and the getopt() system interface ("C function") adhere to the guidelines (further down on the page linked to above) when parsing the command line in a shell script or C program. Specifically, for getopts (as an example):
What this basically says is that options shall come first, and then operands (your "command or file").
Doing it any other way would render using getopts or getopt() impossible, and would additionally likely confuse users used to the POSIX way of specifying options and operands for a command.
Note that the abovementioned standard only applies to POSIX utilities, but as such it sets a precedent for Unix utilities in general. Non-standard Unix utilities can choose to follow or to break this, obviously.
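To illustrate the options-first convention, here is a minimal getopts sketch. The parse_args function and its -a and -b options are hypothetical, chosen only for the example:

```shell
# parse_args is a hypothetical function; -a is a flag, -b takes an argument.
parse_args() {
  OPTIND=1                      # reset so the function can be called repeatedly
  while getopts 'ab:' opt; do
    case $opt in
      a) printf 'option -a\n' ;;
      b) printf 'option -b with argument <%s>\n' "$OPTARG" ;;
      *) return 2 ;;
    esac
  done
  shift "$((OPTIND - 1))"       # drop the options that were parsed
  [ "$#" -eq 0 ] || printf 'operand <%s>\n' "$@"
}

parse_args -a -b value file1 -a
```

Option parsing stops at file1, the first operand, so the final -a is reported as an operand and not as an option.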
For example, the GNU coreutils, even though they implement the standard utilities, allow for things like
if the POSIXLY_CORRECT environment variable is not set, whereas the BSD versions of the same utilities do not.
This has the consequence that the following works as expected (if you expect POSIX behaviour, that is) on a BSD system:
But on a GNU coreutils system, you get
However:
and
will do the "right" thing on a GNU system too.