Bash – Why Does Bash Add Single Quotes to Unquoted Failed Pathname Expansions?

Tags: bash, quoting, shell, trace

I was exploring the tracing of commands using set -x (set +x to unset) in bash:

Print a trace of simple commands, for commands, case commands, select
commands, and arithmetic for commands and their arguments or
associated word lists after they are expanded and before they are
executed. The value of the PS4 variable is expanded and the resultant
value is printed before the command and its expanded arguments.

Now consider the following trace of the bash builtin echo [-neE] [arg …], used with and without quotes:

# set -x        # what I typed
# echo 'love'   # ...
+ echo love     <--(1) the trace
love            # the output

# echo love?    # note the input contains no quotes whatsoever
+ echo 'love?'  <--(2) note the trace shows quotes around the unchanged word
love?           # i.e. no matching file was found

# echo 'love?'  # note the input contains single quotes
+ echo 'love?'  <--(3) traces like (2)
love?

# touch loveu   # we create a file that matches the love? pattern
+ touch loveu

# echo love?    # of course, the pattern now matches the created file
+ echo loveu    <--(4) indeed it finds it and expands to name
loveu           # the name is echoed

So ? is indeed interpreted in this case as a special character used for pattern matching one single character in pathname expansion. Sure enough, once a file matching the pattern was created in the current directory, the match occurred and the name of the file was printed. Of course this behavior is documented:

If no matching file names are found, and the shell option nullglob is
disabled, the word is left unchanged.
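
The documented rule can be checked in a scratch directory. This is only a minimal sketch: the variable names are made up for illustration, and nullglob is switched on in a child bash (via -O) so that the current shell's options are left alone.

```shell
#!/usr/bin/env bash
cd "$(mktemp -d)" || exit 1      # scratch directory with no files in it

set -- love?                     # default: nothing matches, the word is left unchanged
default_word=$1

# With nullglob enabled, the unmatched pattern is removed, leaving zero words.
nullglob_count=$(bash -O nullglob -c 'set -- love?; echo "$#"')

touch loveu                      # now a file matches the pattern
set -- love?
matched_word=$1

printf '%s\n' "$default_word" "$nullglob_count" "$matched_word"
# prints: love?  then 0  then loveu (one per line)
```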

But the word in (2) is the unquoted love?, not 'love?'. The trace shows the state after expansion but before command execution. Pathname expansion was attempted because of the ?, and in case (2) nothing matched. Yet single quotes appear in that trace, exactly as they do in (3), where I typed the quotes myself around the same string. In the other cases the word was either a plain literal (1) or the pattern matched and was "replaced" by the file name in the command (4). This seems to be what the manual section on quote removal, which comes right after expansion, is describing:

After the preceding expansions, all unquoted occurrences of the
characters ‘\’, ‘'’, and ‘"’ that did not result from one of the
above expansions are removed. (my italics)

So here, in (2), we have unquoted occurrences of ' which result from the prior expansion. I did not put them there; bash did, and they are not removed, even though we are just before the execution of the command.


Similar illustration with for

Consider this list used in a for name [ [in [words …] ] ; ] do commands; done loop (see note 1), with no matching file:

# for i in love love? 'love?'; do echo $i; done
+ for i in love 'love?' ''\''love?'\'''
+ echo love
love
+ for i in love 'love?' ''\''love?'\'''
+ echo 'love?'
love?
+ for i in love 'love?' ''\''love?'\'''
+ echo 'love?'
love?

So the echo behavior is the same as before, but for the items in the for construct it seems to be quoting my own quotes, escaping them in the process? I'm uncertain…
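
One way to see what the third traced word denotes is to hand it back to the shell and look at the resulting string. The sketch below does only that. The reading it suggests (that the for header is traced from the words as typed, so the quotes I typed are themselves part of the word being quoted) is an assumption, not something the manual excerpts above state.

```shell
#!/usr/bin/env bash
# The traced word splits into five pieces:  ''  \'  'love?'  \'  ''
decoded=''\''love?'\'''

printf '%s\n' "$decoded"       # prints 'love?' with the quote characters included
printf '%s\n' "${#decoded}"    # prints 7: five characters plus two literal quotes
```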


Questions

  • Why is an unquoted pattern whose pathname expansion failed shown in single quotes in (2)? Expansion has already completed and the pattern matched nothing, so nothing should need expanding anymore. I guess what I'm asking is why we care at this point, which is just before 3.7.2-4 in the bash manual. Why isn't the word left "as is", with expansion simply turned off for command execution, i.e. something like set -f?
  • (What is the for loop doing with my single quoted item in the list?)

1. When a word list is used with for like this, it is really just a list of items and the values themselves are only a convenience; I find t="0"; for i in 0 0 0 0; do let t++; echo "yes, this is really $t times"; done quite convincing.

Best Answer

When instructed to echo commands as they are executed ("execution trace"), both bash and ksh add single quotes around any word with meta-characters (*, ?, ;, etc.) in it.

The meta-characters could have gotten into the word in a variety of ways. The word (or part of it) could have been quoted with single or double quotes, the characters could have been escaped with a \, or they remained as the result of a failed filename matching attempt. In all cases, the execution trace will contain single-quoted words, for example:

$ set -x
$ echo foo\;bar
+ echo 'foo;bar'
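
The three routes described above can be compared side by side. This is a rough sketch, assuming a scratch directory in which love? matches nothing; trace_of is a hypothetical helper invented here, which runs a command string in a child bash with -x and keeps only the trace.

```shell
#!/usr/bin/env bash
cd "$(mktemp -d)" || exit 1    # empty directory, so love? cannot match a file

# Hypothetical helper: capture the trace (stderr) of a child bash, drop its stdout.
trace_of() { PS4='+ ' bash -xc "$1" 2>&1 >/dev/null; }

quoted=$(trace_of "echo 'love?'")    # metacharacter protected by single quotes
escaped=$(trace_of 'echo love\?')    # metacharacter protected by a backslash
unmatched=$(trace_of 'echo love?')   # left over from a failed filename match

printf '%s\n' "$quoted" "$escaped" "$unmatched"   # three identical trace lines

# What the traced command actually writes carries no quotes.
output=$(bash -xc 'echo love?' 2>/dev/null)
printf '%s\n' "$output"
```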

This is just an artifact of the way the shells implement the execution trace; it doesn't alter the way the arguments are ultimately passed to the command. The quotes are added, printed, and discarded. Here is the relevant part of the bash source code, print_cmd.c:

/* A function to print the words of a simple command when set -x is on. */
void
xtrace_print_word_list (list, xtflags)
...
{
  ...
  for (w = list; w; w = w->next)
    {
      t = w->word->word;
      ...
      else if (sh_contains_shell_metas (t))
        {
          x = sh_single_quote (t);
          fprintf (xtrace_fp, "%s%s", x, w->next ? " " : "");
          free (x);
        }

As to why the authors chose to do this, the code there doesn't say. But here's some similar code in variables.c, and it comes with a comment:

/* Print the value cell of VAR, a shell variable.  Do not print
   the name, nor leading/trailing newline.  If QUOTE is non-zero,
   and the value contains shell metacharacters, quote the value
   in such a way that it can be read back in. */
void
print_var_value (var, quote)
...
{
  ...
  else if (quote && sh_contains_shell_metas (value_cell (var)))
    {
      t = sh_single_quote (value_cell (var));
      printf ("%s", t);
      free (t);
    }

So possibly it's done so that it's easier to copy the command lines from the output of the execution trace and run them again.
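
That guess can be tried out with a small experiment. The following is only a sketch under stated assumptions: the default PS4 of "+ ", a scratch directory, and illustrative variable names. It captures a trace line, creates a matching file afterwards, and then re-runs both the traced text and the bare word.

```shell
#!/usr/bin/env bash
cd "$(mktemp -d)" || exit 1          # scratch directory, initially empty

# Capture the trace line (stderr) of a child bash; its stdout is discarded.
trace=$(PS4='+ ' bash -xc 'echo love?' 2>&1 >/dev/null)
cmd=${trace#+ }                      # strip the PS4 prefix, leaving: echo 'love?'

touch loveu                          # a matching file shows up later

rerun_quoted=$(eval "$cmd")          # the traced text still passes the word love?
rerun_bare=$(eval 'echo love?')      # the bare word now expands to loveu

printf '%s\n' "$cmd" "$rerun_quoted" "$rerun_bare"
```

Re-running the quoted trace line reproduces the argument the original command received, whereas the unquoted text would depend on whatever files exist at the time it is pasted back in.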
