Shell – What is List Context and String Context?


I have seen the terms "list context" and "string context" used several times.

I know and understand these terms in Perl, where they apply to $ and @.

However, when used in descriptions of the shell, the terms seem vague: they are not defined anywhere or, at best, are poorly documented.

There is no definition of them in POSIX, according to Google.

Is this quote the gist of it?

In a nutshell, double quotes are necessary wherever a list of words or a pattern is expected. They are optional in contexts where a raw string is expected by the parser.

But it seems like a difficult term to use. How can we work out what the result should be when we need the result in order to know whether we are in a string or a list context?

Or can it be precisely and correctly defined?

Best Answer

There is no such concept in the standard shell language. There are no "contexts", only expansion steps.

Quotes are first identified during tokenization, which produces words. They glue text together so that abc"spaces here"xyz is one "word".
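The gluing effect can be observed by counting positional parameters. This is only a small sketch for a POSIX-style sh; the variable names (glued_count, glued_word, plain_count) are made up for illustration.

```shell
# The quoted region is glued to the adjacent unquoted text: one word results.
set -- abc"spaces here"xyz
glued_count=$#
glued_word=$1

# Without the quotes, the blank separates two words during tokenization.
set -- abcspaces herexyz
plain_count=$#

printf '%s %s\n' "$glued_count" "$plain_count"   # prints: 1 2
```

Note that glued_word holds abcspaces herexyz: the quotes themselves are not part of the word's content.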

The important thing to understand is that quotes are preserved through the subsequent expansion steps, and the original quotes are distinguished from quotes that might arise out of expansions.

Parameters are expanded inside double quotes just as they are outside them (only single quotes suppress expansion). Later, though, a field splitting process takes place which harkens back to the first tokenization. Once again, quotes prevent splitting and, once again, are preserved.
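A minimal sketch of expansion followed by field splitting, assuming a POSIX-style sh with the default IFS (space, tab, newline). The variable names are hypothetical.

```shell
value='a b  c'

# Unquoted: the expansion result undergoes field splitting on IFS whitespace.
set -- $value
unquoted_count=$#    # 3 fields: a, b, c

# Double-quoted: the parameter is still expanded, but splitting is suppressed.
set -- "$value"
quoted_count=$#      # 1 field
quoted_first=$1      # the whole string, inner blanks intact

printf '%s %s\n' "$unquoted_count" "$quoted_count"   # prints: 3 1
```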

Pathname expansion ("globbing") takes place after this splitting. The preserved quotes prevent it: globbing operators are not recognized inside quotes.
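The effect of quotes on globbing can be checked with a scratch directory. This sketch assumes the mktemp utility with its -d option is available (it is common but not specified by POSIX); the file and variable names are invented for the example.

```shell
workdir=$(mktemp -d) || exit 1
: > "$workdir/one.txt"
: > "$workdir/two.txt"

# The * is outside the quotes, so it is a globbing operator: two matches.
set -- "$workdir"/*.txt
unquoted_count=$#

# The * is inside the quotes, so the word stays literal: one argument.
set -- "$workdir/*.txt"
quoted_count=$#
quoted_word=$1

rm -rf "$workdir"
printf '%s %s\n' "$unquoted_count" "$quoted_count"   # prints: 2 1
```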

Finally, the quotes are removed by a late stage called "quote removal". Of course, only the original quotes are removed!
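The distinction between original quotes and quotes that arise from an expansion can be illustrated as follows. This is a sketch assuming a POSIX-style sh with the default IFS; the variable names are hypothetical.

```shell
# The single quotes are original and get removed; the double quotes are data.
value='"a b"'

# Quotes produced by the expansion are ordinary characters: they neither
# prevent field splitting nor get removed by quote removal.
set -- $value
count=$#      # 2
first=$1      # "a   (with the leading double quote)
second=$2     # b"   (with the trailing double quote)

printf '<%s> <%s>\n' "$first" "$second"   # prints: <"a> <b">
```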

POSIX does a good job of presenting the process in a way that is understandable; attempts to demystify it with extraneous concepts (that may be misleading) are only going to muddle the understanding.

People throwing around ad hoc concepts like "list context" should formalize their thinking to the point that it can provide a complete alternative specification for all of the processing, which is equivalent (produces the same results). And then, avoid mixing concepts between the parallel designs: use one explanation or the other. A "list context" or "string context" makes sense in a theory of shell expansion in which these are well defined, and the processing steps are organized around these concepts.

If I were to guess, then "list context" refers to the idea that the shell is working with a list of tokenized words such as the two-word list {foo} {abc" x "def}. The quotes are not part of the second word: its content is actually abc x def; they are semantic quotes which prevent the splitting on whitespace. Inside these quotes, we have "string context".

However, a possible implementation of these expansion steps is not to actually have quotes which are identified as the original quotes, but some sort of list data structure, so that {foo} {abc" x "def} is, say, a list of lists in which the quoted parts are identified as different kinds of nodes (and the quotes are gone). Using Lisp notation it could be:

(("foo") ;; one-element word
 ("abc" (:dq-str " x ") "def")) ;; three-element word

The nodes without a label are literal text, :dq-str is a double-quote region. Another type could be :sq-str for a single quoted item.

The expansion can walk this structure, and then do different things based on whether it's looking at a string object, a :dq-str expression or whatever. Pathname expansion and field splitting would be suppressed within both :dq-str and :sq-str. But parameter expansion does take place within :dq-str. "Quote removal" would then correspond to a final pass which takes the pieces and concatenates the strings, flattening the interior list structure and losing the type-indicating symbols, resulting in:

("foo"
 "abc x def") ;; plain string list, usable as command arguments
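In a real shell only this flattened result is observable. The sketch below, assuming a POSIX-style sh with the default IFS and using a made-up variable mid to stand in for the quoted region, shows the two-element argument list, and what happens when the same text is not protected by a double-quote region.

```shell
mid=' x '

# abc, the double-quoted expansion, and def are pieces of a single word.
set -- foo abc"$mid"def
count=$#       # 2
second=$2      # abc x def

# Unquoted, the expansion result is field-split: foo, abc, x, def.
set -- foo abc${mid}def
unquoted_count=$#   # 4

printf '<%s>\n' "$second"   # prints: <abc x def>
```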

Now note how, before flattening, the second item is ("abc" (:dq-str " x ") "def"). The first and last pieces are unwrapped: they are direct elements of the list, and so we can say they are in the "list context". The middle " x ", by contrast, is wrapped in a :dq-str expression, so it is in the "(double quoted) string context".

What "list" refers to in "list context" is anyone's guess without a clearly defined model such as this. Is it the master word list? Or a list of chunks representing one word?