Variable expansion (the standard term is parameter expansion, and it's also sometimes called variable substitution) basically means replacing the variable by its value. More precisely, it means replacing the $VARIABLE construct (or ${VARIABLE} or ${VARIABLE#TEXT} or other constructs) by some other text which is built from the value of the variable. This other text is the expansion of the variable.
The expansion process goes as follows. (I only discuss the common case; some shell settings and extensions modify the behavior.)
- Take the value of the variable, which is a string. If the variable is not defined, use the empty string.
- If the construct includes a transformation, apply it. For example, if the construct is ${VARIABLE#TEXT}, and the value of the variable begins with TEXT, remove TEXT from the beginning of the value.
- If the context calls for a single word (for example within double quotes, or in the right-hand side of an assignment, or inside a here document), stop here. Otherwise continue with the next steps.
- Split the value into separate words at each sequence of whitespace. (The variable IFS can be changed to split at characters other than whitespace.) The result is thus no longer a string, but a list of strings. This list can be empty if the value contained only whitespace.
- Treat each element of the list as a file name wildcard pattern, i.e. a glob. If the pattern matches some files, it is replaced by the list of matching file names; otherwise it is left alone.
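The difference between a single-word context and a list context, and the effect of IFS, can be checked with a short sketch. The variable names (value, copy, path_like and so on) are made up for illustration, and the script assumes a POSIX-style shell with default settings (IFS at its default value when the script starts).

```shell
value='one two  three'

# Single-word context: the right-hand side of an assignment is not split,
# so the double space inside the value survives.
copy=$value

# List context: an unquoted expansion is split at each run of whitespace.
set -- $value
default_count=$#

# Changing IFS changes the characters at which the value is split.
path_like='/usr/bin:/bin:/usr/local/bin'
old_ifs=$IFS
IFS=:
set -- $path_like
colon_count=$#
second_dir=$2
IFS=$old_ifs

# A value containing only whitespace expands to an empty list.
blank='   '
set -- $blank
blank_count=$#

echo "$default_count $colon_count $second_dir $blank_count"
```

Using set -- to load the result into the positional parameters is just a convenient way to count the words that the expansion produced.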
For example, suppose that the variable foo contains a* b* c* and the current directory contains the files bar, baz and paz. Then ${foo#??} is expanded as follows:
- The value of the variable is the 8-character string a* b* c*.
- #?? means strip off the first two characters, resulting in the 6-character string b* c* (with an initial space).
- If the expansion is in a list context (i.e. not in double quotes or other similar context), continue.
- Split the string into whitespace-delimited words, resulting in a list of two strings: b* and c*.
- The string b*, interpreted as a pattern, matches two files: bar and baz. The string c* matches no file so it is left alone. The result is a list of three strings: bar, baz, c*.
For example, echo ${foo#??} prints bar baz c* (the command echo joins its arguments with a space in between).
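The walkthrough above can be reproduced in a scratch directory. This is a minimal sketch, assuming a POSIX-style shell with default settings (default IFS, no nullglob or similar option) and the mktemp utility; the helper variable names are invented for the illustration.

```shell
# Work in a scratch directory containing exactly the three files from the example.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch bar baz paz

foo='a* b* c*'

# Unquoted: strip two characters, split at whitespace, then glob each word.
set -- ${foo#??}
unquoted_count=$#
unquoted="$*"

# Double-quoted: the process stops after the transformation, giving one word.
set -- "${foo#??}"
quoted_count=$#
quoted=$1

echo ${foo#??}
echo "${foo#??}"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

The first echo receives three arguments (bar, baz and c*), while the second receives the single 6-character string with its leading space intact.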
When instructed to echo commands as they are executed ("execution trace"), both bash and ksh add single quotes around any word with meta-characters (*, ?, ;, etc.) in it.
The meta-characters could have gotten into the word in a variety of ways. The word (or part of it) could have been quoted with single or double quotes, the characters could have been escaped with a \, or they remained as the result of a failed filename matching attempt. In all cases, the execution trace will contain single-quoted words, for example:
$ set -x
$ echo foo\;bar
+ echo 'foo;bar'
This is just an artifact of the way the shells implement the execution trace; it doesn't alter the way the arguments are ultimately passed to the command. The quotes are added, printed, and discarded. Here is the relevant part of the bash source code, print_cmd.c:
/* A function to print the words of a simple command when set -x is on. */
void
xtrace_print_word_list (list, xtflags)
...
{
  ...
  for (w = list; w; w = w->next)
    {
      t = w->word->word;
      ...
      else if (sh_contains_shell_metas (t))
        {
          x = sh_single_quote (t);
          fprintf (xtrace_fp, "%s%s", x, w->next ? " " : "");
          free (x);
        }
As to why the authors chose to do this, the code there doesn't say. But here's some similar code in variables.c, and it comes with a comment:
/* Print the value cell of VAR, a shell variable.  Do not print
   the name, nor leading/trailing newline.  If QUOTE is non-zero,
   and the value contains shell metacharacters, quote the value
   in such a way that it can be read back in. */
void
print_var_value (var, quote)
...
{
  ...
  else if (quote && sh_contains_shell_metas (value_cell (var)))
    {
      t = sh_single_quote (value_cell (var));
      printf ("%s", t);
      free (t);
    }
So possibly it's done so that it's easier to copy the command lines from the output of the execution trace and run them again.
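That guess is easy to try out. The following sketch assumes bash is installed and on PATH; it captures the trace of the command shown earlier, checks that the argument itself arrives without quotes, and then runs the traced line again. The variable names trace, output and rerun are invented for the illustration, and PS4 is set explicitly so that the trace prefix is the default "+ ".

```shell
# Capture only the execution trace (written to stderr) of a command
# whose argument contains the meta-character ';'.
trace=$(bash -c 'PS4="+ "; set -x; echo foo\;bar' 2>&1 >/dev/null)
printf '%s\n' "$trace"

# The argument itself reaches echo without any quotes.
output=$(bash -c 'set -x; echo foo\;bar' 2>/dev/null)

# Strip the leading "+ " and run the traced command line again.
rerun=$(eval "${trace#+ }")
printf '%s\n' "$rerun"
```

Because the trace quotes the word, the copied line parses back into the same single argument; without the quotes, the ; would end the echo command and the shell would try to run bar as a second command.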
Best Answer
There is no such concept in the standard shell language. There are no "contexts", only expansion steps.
Quotes are first identified in the tokenization which produces words. They glue words together so that abc"spaces here"xyz is one "word". The important thing to understand is that quotes are preserved through the subsequent expansion steps, and the original quotes are distinguished from quotes that might arise out of expansions.
Parameters are expanded without regard for double quotes. Later, though, a field splitting process takes place which harkens back to the first tokenization. Once again, quotes prevent splitting and, once again, are preserved.
Pathname expansion ("globbing") takes place after this splitting. The preserved quotes prevent it: globbing operators are not recognized inside quotes.
Finally the quotes are removed by a late stage called "quote removal". Of course, only the original quotes!
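These steps can be observed from a script. This is a small sketch, assuming a POSIX-style shell with default IFS; the variable names are made up for the illustration.

```shell
# Quotes typed in the command glue the pieces into one word
# and are removed by the final quote removal step.
set -- abc" x "def
glued_count=$#
glued=$1

# Quote characters that come out of an expansion are ordinary data:
# they do not prevent field splitting and they are not removed.
v='"a b"'
set -- $v
data_count=$#
w1=$1
w2=$2

# Original quotes around an expansion suppress splitting and globbing.
star='*'
set -- "$star"
literal=$1

printf '%s\n' "$glued_count:$glued" "$data_count:$w1:$w2" "$literal"
```

The middle case is the one that usually surprises people: the value of v is split into the two words "a and b", each keeping its quote character.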
POSIX does a good job of presenting the process in a way that is understandable; attempts to demystify it with extraneous concepts (that may be misleading) are only going to muddle the understanding.
People throwing around ad hoc concepts like "list context" should formalize their thinking to the point that it can provide a complete alternative specification for all of the processing, which is equivalent (produces the same results). And then, avoid mixing concepts between the parallel designs: use one explanation or the other. A "list context" or "string context" makes sense in a theory of shell expansion in which these are well defined, and the processing steps are organized around these concepts.
If I were to guess, then "list context" refers to the idea that the shell is working with a list of tokenized words such as the two-word list {foo} {abc" x "def}. The quotes are not part of the second word: its content is actually abc x def; they are semantic quotes which prevent the splitting on whitespace. Inside these quotes, we have "string context".
However, a possible implementation of these expansion steps is not to actually have quotes which are identified as the original quotes, but some sort of list data structure, so that {foo} {abc" x "def} is, say, a list of lists in which the quoted parts are identified as different kinds of nodes (and the quotes are gone). Using Lisp notation it could be something like (("foo") ("abc" (:dq-str " x ") "def")).
The nodes without a label are literal text; :dq-str is a double-quote region. Another type could be :sq-str for a single quoted item.
The expansion can walk this structure, and then do different things based on whether it's looking at a string object, a :dq-str expression or whatever. File expansion and field splitting would be suppressed within both :dq-str and :sq-str. But parameter expansion does take place within :dq-str. "Quote removal" would then correspond to a final pass which takes the pieces and catenates the strings, flattening the interior list structure and losing the type-indicating symbols, resulting in the plain words foo and abc x def.
Now here, note how in the second item we have ("abc" (:dq-str " x ") "def"). The first and last items are unwrapped: they are direct elements of the list and so we can say these are in the "list context". Whereas the middle " x " is wrapped in a :dq-str expression, so that is "(double quoted) string context".
What "list" refers to in "list context" is anyone's guess without a clearly defined model such as this. Is it the master word list? Or a list of chunks representing one word?