When and Why Double-Quoting is Necessary in Shell Scripts


The old advice used to be to double-quote any expression involving a $VARIABLE, at least if one wanted the shell to interpret it as one single item. Otherwise, any spaces in the content of $VARIABLE would throw off the shell.

I understand, however, that in more recent versions of shells, double-quoting is no longer always needed (at least for the purpose described above). For instance, in bash:

% FOO='bar baz'
% [ $FOO = 'bar baz' ] && echo OK
bash: [: too many arguments
% [[ $FOO = 'bar baz' ]] && echo OK
OK
% touch 'bar baz'
% ls $FOO
ls: cannot access bar: No such file or directory
ls: cannot access baz: No such file or directory

In zsh, on the other hand, the same three commands succeed. Therefore, based on this experiment, it seems that, in bash, one can omit the double quotes inside [[ ... ]], but not inside [ ... ] nor in command-line arguments, whereas, in zsh, the double quotes may be omitted in all these cases.

But inferring general rules from anecdotal examples like the above is a chancy proposition. It would be nice to see a summary of when double-quoting is necessary. I'm primarily interested in zsh, bash, and /bin/sh.

Best Answer

First, separate zsh from the rest. It's not a matter of old vs modern shells: zsh behaves differently. The zsh designers decided to make it incompatible with traditional shells (Bourne, ksh, bash), but easier to use.

Second, it is far easier to use double quotes all the time than to remember when they are needed. They are needed most of the time, so you'll need to learn when they aren't needed, not when they are needed.

In a nutshell, double quotes are necessary wherever a list of words or a pattern is expected. They are optional in contexts where a raw string is expected by the parser.

What happens without quotes

Note that without double quotes, two things happen.

  1. First, the result of the expansion (the value of the variable for a parameter substitution like ${foo}, or the output of the command for a command substitution like $(foo)) is split into words wherever it contains whitespace.
    More precisely, the result of the expansion is split at each character that appears in the value of the IFS variable (separator character). If a sequence of separator characters contains whitespace (space, tab or newline), the whitespace counts as a single character; leading, trailing or repeated non-whitespace separators lead to empty fields. For example, with IFS=" :", :one::two : three: :four  produces empty fields before one, between one and two, and (a single one) between three and four.
  2. Each field that results from splitting is interpreted as a glob (a wildcard pattern) if it contains one of the characters \[*?. If that pattern matches one or more file names, the pattern is replaced by the list of matching file names.

An unquoted variable expansion $foo is colloquially known as the “split+glob operator”, in contrast with "$foo" which just takes the value of the variable foo. The same goes for command substitution: "$(foo)" is a command substitution, $(foo) is a command substitution followed by split+glob.
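
The two steps can be observed by counting the words that an expansion produces. The following is only a sketch: count_args is a hypothetical helper (it is not part of any shell), and the file names a.txt and b.txt are made up for the demonstration. It assumes a POSIX-style sh and that mktemp -d is available.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

value='one two  three'
unquoted_count=$(count_args $value)      # split at whitespace: 3 words
quoted_count=$(count_args "$value")      # value taken as is: 1 word

# The IFS example from the text, run inside the command substitution's
# subshell so that IFS stays unchanged for the rest of the script.
fields=':one::two : three: :four  '
ifs_count=$(IFS=' :'; count_args $fields)   # 7 fields, 3 of them empty

# Globbing: a field that looks like a pattern is matched against file names.
tmpdir=$(mktemp -d)
touch "$tmpdir/a.txt" "$tmpdir/b.txt"
pattern='*.txt'
glob_count=$(cd "$tmpdir" && count_args $pattern)        # 2 matching files
literal_count=$(cd "$tmpdir" && count_args "$pattern")   # 1 literal word
rm -rf "$tmpdir"

printf '%s\n' "$unquoted_count $quoted_count $ifs_count $glob_count $literal_count"
```

The counts are captured through assignments, which is one of the contexts where the result of a command substitution is not split.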

Where you can omit the double quotes

Here are all the cases I can think of in a Bourne-style shell where you can write a variable or command substitution without double quotes, and the value is interpreted literally.

  • On the right-hand side of an assignment.

    var=$stuff
    a_single_star=*
    

    Note that in some shells you do need the double quotes after export, because there it is an ordinary builtin, not a keyword. This applies to shells such as dash, zsh (in sh emulation), yash or posh; bash and ksh both treat export specially.

    export VAR="$stuff"
    
  • In a case statement.

    case $var in …
    

    Note that you do need double quotes in a case pattern. Word splitting doesn't happen in a case pattern, but an unquoted variable is interpreted as a pattern whereas a quoted variable is interpreted as a literal string.

    a_star='a*'
    case $var in
      "$a_star") echo "'$var' is the two characters a, *";;
       $a_star) echo "'$var' begins with a";;
    esac
    
  • Within double brackets. Double brackets are shell special syntax.

    [[ -e $filename ]]
    

    Except that you do need double quotes where a pattern or regular expression is expected: on the right-hand side of = or == or != or =~.

    a_star='a*'
    if [[ $var == "$a_star" ]]; then echo "'$var' is the two characters a, *"
    elif [[ $var == $a_star ]]; then echo "'$var' begins with a"
    fi
    

    You do need double quotes as usual within single brackets [ … ] because they are ordinary shell syntax ([ is a command that happens to be called [). See Single or double brackets.

  • In a redirection in non-interactive POSIX shells (not bash, nor ksh88).

    echo "hello world" >$filename
    

    Some shells, when interactive, do treat the value of the variable as a wildcard pattern. POSIX prohibits that behaviour in non-interactive shells, but a few shells still do it there, including bash (except in POSIX mode) and ksh88 (including when found as the supposedly POSIX sh of some commercial Unices like Solaris). Bash also attempts splitting, and the redirection fails unless that split+globbing results in exactly one word. It is therefore better to quote the targets of redirections in a sh script, in case you want to convert it to a bash script some day, run it on a system where sh is non-compliant on that point, or source it from an interactive shell.

  • Inside an arithmetic expression. In fact, you need to leave the quotes out in order for a variable to be parsed as an arithmetic expression.

    expr=2*2
    echo "$(($expr))"
    

    However, you do need the quotes around the arithmetic expansion itself, as its result is subject to word splitting in most shells, as POSIX requires (!?).

  • In an associative array subscript.

    typeset -A a
    i='foo bar*qux'
    a[foo\ bar\*qux]=hello
    echo "${a[$i]}"
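
Several of the contexts above can be checked in one short script. This is a sketch rather than a complete test of every case: it sticks to constructs available in a plain POSIX sh, so double brackets and associative arrays are left out, and classify is a hypothetical helper introduced only for illustration. It assumes mktemp -d is available.

```shell
# All names below (stuff, classify, tmpdir, ...) are illustrative.
stuff='two  words *'

# 1. Right-hand side of an assignment: no splitting, no globbing.
var=$stuff

# 2. The word after "case" needs no quotes, but a pattern held in a
#    variable is taken literally only when it is quoted.
a_star='a*'
classify() {
  case $1 in
    "$a_star") echo literal ;;   # exactly the two characters a, *
    $a_star)   echo prefix ;;    # anything that begins with a
    *)         echo other ;;
  esac
}
literal_result=$(classify 'a*')
prefix_result=$(classify 'apple')
other_result=$(classify 'b c')

# 3. Arithmetic: the inner variable stays unquoted so that its value is
#    parsed as an expression; the expansion as a whole is quoted.
expr=2*2
arith_result="$(($expr))"

# 4. Redirection target: quoted here, which works in every shell.
tmpdir=$(mktemp -d)
filename="$tmpdir/hello world.txt"
echo "hello world" >"$filename"
redirect_result=$(cat "$filename")
rm -rf "$tmpdir"

printf '%s\n' "$var" "$literal_result" "$prefix_result" \
  "$other_result" "$arith_result" "$redirect_result"
```

The redirection is the one place where the sketch quotes even though a compliant non-interactive sh would not require it, for the portability reasons given above.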
    

Unquoted variable and command substitutions can be useful in some rare circumstances:

  • When the variable value or command output consists of a list of glob patterns and you want to expand these patterns to the list of matching files.
  • When you know that the value doesn't contain any wildcard character, that $IFS was not modified and you want to split it at whitespace characters.
  • When you want to split a value at a certain character: disable globbing with set -f, set IFS to the separator character (or leave it alone to split at whitespace), then do the expansion.
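
The last case might look like the sketch below. split_on_colon is a hypothetical helper, and the colon-separated value is made up; the point is only to show set -f and IFS being combined with a deliberately unquoted expansion.

```shell
# split_on_colon is a hypothetical helper. Its body is a subshell, so
# set -f and the IFS change do not leak into the caller.
split_on_colon() (
  set -f            # disable globbing, so '*' stays literal
  IFS=:             # split at colons only, not at whitespace
  set -- $1         # deliberately unquoted: split, but no glob
  printf '%s|%s|%s\n' "$#" "$2" "$3"
)

path_like='/usr/bin:/opt/my tools/bin:/tmp/*'
split_result=$(split_on_colon "$path_like")
printf '%s\n' "$split_result"    # prints 3|/opt/my tools/bin|/tmp/*
```

Running the split in a subshell is a design choice of this sketch: it avoids having to save and restore IFS and the globbing option by hand.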

Zsh

In zsh, you can omit the double quotes most of the time, with a few exceptions.

  • $var never expands to multiple words. However, it expands to the empty list (as opposed to a list containing a single, empty word) if the value of var is the empty string. Contrast:

    var=
    print -l $var foo        # prints just foo
    print -l "$var" foo      # prints an empty line, then foo
    

    Similarly, "${array[@]}" expands to all the elements of the array, while $array only expands to the non-empty elements.

  • The @ parameter expansion flag sometimes requires double quotes around the whole substitution: "${(@)foo}".

  • Command substitution undergoes field splitting if unquoted: echo $(echo 'a'; echo '*') prints a * (with a single space) whereas echo "$(echo 'a'; echo '*')" prints the unmodified two-line string. Use "$(somecommand)" to get the output of the command in a single word, sans final newlines. Use "${$(somecommand; echo _)%?}" to get the exact output of the command including final newlines. Use "${(@f)$(somecommand)}" to get an array of lines from the command's output.
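
The zsh one-liner for keeping final newlines nests a command substitution inside a parameter expansion, which other Bourne-style shells do not accept. For comparison, here is a sketch of the same sentinel trick written in two steps for a POSIX sh; emit is a hypothetical command standing in for somecommand.

```shell
# emit is a hypothetical command whose output ends in two newlines.
emit() { printf 'line\n\n'; }

stripped=$(emit)          # final newlines are removed
exact=$(emit; echo _)     # the sentinel "_" protects them
exact=${exact%_}          # drop the sentinel again

printf '%s\n' "${#stripped} ${#exact}"   # lengths: 4 and 6
```

The quotes are omitted around the substitutions because they sit on the right-hand side of an assignment.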
