Shell Arithmetic – Expansion with Quotes

In Bash and Dash, quoting an operand inside an arithmetic expansion is a syntax error:

$ bash -c 'x=123;echo $(("$x"))'
bash: "123": syntax error: operand expected (error token is ""123"")
$ dash -c 'x=123;echo $(("$x"))'
dash: 1: arithmetic expression: expecting primary: ""123""

Bash gives the same error when invoked as sh. Ksh and FreeBSD's Bourne Shell don't mind it, though:

$ ksh -c 'x=123;echo $(("$x"))'
123
$ sh -c 'x=123;echo $(("$x"))'
123
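Since shells disagree here, one way to sidestep the question is to leave the quotes out entirely. The following is a minimal sketch, not a statement about what any particular shell guarantees beyond POSIX arithmetic expansion; the variable names unquoted, dollar, and quoted_ok are made up for illustration. The probe at the end deliberately shows no expected result, because the answer depends on the shell and its version.

```shell
# Portable form: keep quotes out of the arithmetic expansion.
x=123
unquoted=$((x))       # bare variable name holding an integer constant
dollar=$(($x + 1))    # explicit $x expansion, still without quotes
echo "$unquoted $dollar"

# Probe whether the running shell accepts a quoted operand.
# The eval runs in a subshell so a syntax error cannot abort this script.
if (eval 'echo $(("$x"))') >/dev/null 2>&1; then
    quoted_ok=yes
else
    quoted_ok=no
fi
echo "quoted operand accepted: $quoted_ok"
```

The first echo prints 123 124 in any POSIX shell; only the probe line varies.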

According to the Bash Reference Manual:

The expression is treated as if it were within double quotes, but a double quote inside the parentheses is not treated specially. All tokens … undergo … quote removal.

(This is essentially what POSIX says as well.)

Finally, Bash treats $(( )) differently from other arithmetic contexts such as the (( )) command (used in conditionals, for example): the latter accepts quotes.

Either I misunderstand what quote removal means here, or this is a bug in some of these shell implementations. If it's the former, what does "quote removal" actually mean?

Best Answer

I'm torn on whether this is a poor implementation or poor documentation. Bash says this about quote removal:

Quote Removal

After the preceding expansions, all unquoted occurrences of the characters \, ', and " that did not result from one of the above expansions are removed.

I think the key is "all unquoted occurrences" in that paragraph. Per the documentation, everything inside $(( )) is treated as if it were in double quotes, so any \, ', or " between the parentheses already counts as quoted, which makes quote removal essentially a no-op. For example, note how the other "removable" characters are treated (and how the trailing space is preserved in the error token, because of how quoted strings are parsed):

$ echo $(( '5' ))
bash: '5' : syntax error: operand expected (error token is "'5' ")
$ echo $(( \ ))
bash: \ : syntax error: operand expected (error token is "\ ")
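For contrast, here is a small sketch of quote removal in ordinary words, outside $(( )), where the "unquoted occurrences" rule is easier to see. The variable names plain, inner, q, and expanded are only illustrative.

```shell
# Quote removal in ordinary words, outside arithmetic expansion.
plain=$(printf '%s' "a"'b'\c)    # unquoted quoting characters are removed
inner=$(printf '%s' "it's")      # the apostrophe is quoted by the double quotes, so it stays
q='"x"'
expanded=$(printf '%s' $q)       # quotes that came from an expansion stay as well
printf '%s\n' "$plain" "$inner" "$expanded"
```

This prints abc, it's, and "x" on separate lines: only the unquoted quoting characters typed directly in the word are removed.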

Skimming the source code, the quotes do need to be balanced, because of the code that scans ahead to decide whether $(( begins an arithmetic expansion or a command substitution containing a nested subshell. Once the string is identified as an arithmetic expression, it is parsed as if it were double-quoted, which means every character inside is considered quoted before quote removal happens.

Personally, this is part of why I prefer ksh, especially for math. For example, it treats the single-quoted '5' above as a C character constant, which evaluates to 53. See man ascii for why that makes sense. :)
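The same number can be seen without ksh. As a rough illustration (this uses the printf utility, not arithmetic expansion, and the variable name code is made up), POSIX printf treats a numeric argument that starts with a quote character as the character code of the character that follows:

```shell
# A leading quote in a numeric printf argument means
# "use the character code of the next character".
code=$(printf '%d' "'5")
echo "$code"    # 53 in an ASCII-based locale
```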
