Bash – Is “Arithmetic Expansion” the expected action on vars inside `[[` tests


This script:

a=2
[[ "$a" -eq 2 ]] && echo yes1
[[  $a  -eq 2 ]] && echo yes2
[[   a  -eq 2 ]] && echo yes3
[[  "a" -eq 2 ]] && echo yes4

It does not work in dash (or in other shells that lack [[ ).
In ash (BusyBox ash) the last two "yes" tests fail, as I expected.

But, quite unexpectedly, it prints all four yes in ksh, ksh93, lksh, mksh, bash and zsh.
This happens when the arithmetic operator -eq is used (and with some of the other arithmetic operators).

Q's

Is this the natural side effect of "Arithmetic Expansion" being applied to labels like (a)?
(nothing to see, move along?)

Should (a) still be expanded as a variable when quoted: "a"?

Is this documented somewhere I haven't looked yet?

Or is it a simple (very old) bug?

Edit

To really get you thinking, check this:

a=2; b=2

[[ "$a" -eq "$b"                 ]] && echo yes1
[[  $a  -eq "$b"                 ]] && echo yes2
[[   a  -eq "$b"                 ]] && echo yes3
[[  "a" -eq "$b"                 ]] && echo yes4
[[ "$a" -eq "$b" && "$a" == "$b" ]] && echo new1
[[  $a  -eq "$b" &&  $a  == "$b" ]] && echo new2
[[   a  -eq "$b" &&   a  == "$b" ]] && echo new3
[[  "a" -eq "$b" &&  "a" == "$b" ]] && echo new4

All four yes tests succeed, but of the new tests only new1 and new2 do. The variable a is not expanded in new3 and new4.

That means the shells switch mode (arithmetic to string) in the middle of the evaluation.

This doesn't seem so obvious to me. At least, it is not simple, IMO.
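The mode switch can be isolated to a single operand. The following is a minimal sketch, assuming bash (the other shells that printed all four yes above should behave the same way, but that is not checked here); the variable names arith, string and expanded are only for illustration.

```bash
#!/usr/bin/env bash
a=2

# -eq: both operands are arithmetic expressions, so the bare word a is
# looked up as a variable and evaluates to 2.
if [[ a -eq 2 ]]; then arith=match; else arith=nomatch; fi

# ==: the left operand is a plain string, so this compares the literal
# one-character string "a" against the pattern 2.
if [[ a == 2 ]]; then string=match; else string=nomatch; fi

# Parameter expansion happens before the comparison sees the word,
# so $a works with either operator.
if [[ $a == 2 ]]; then expanded=match; else expanded=nomatch; fi

echo "arith=$arith string=$string expanded=$expanded"
```

In bash this prints arith=match string=nomatch expanded=match, which is the same split seen between the yes3/yes4 and new3/new4 tests.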

Note that ash does it differently, and also reports the error.

Conclusion:

I do not know the correct answer. But I suspect (seeing so many shells do it in exactly the same way) that this is a way, well known to developers, in which these tests are supposed to work. It should be up to us to understand that and adapt.

Seems that for numeric -eq the only possible 'Conditional Expression' to apply is:

exp1 -eq exp2
          true if exp1 is numerically equal to exp2.

Which makes both sides of the -eq "Arithmetic expressions"

While for the == the only possible 'Conditional Expression' to apply is:

string == pattern
          true if string matches pattern. ....

Which makes the left-hand side a string (no expansion?) and the right-hand side a pattern (which does not really apply to the tests I presented, as all were quoted on the right side to avoid this complexity). So the right-hand side, being quoted, is also treated as a string (no arithmetic expansion here).

But maybe I am more confused than I thought 🙂

What do you say?

Best Answer

From man ksh:

An arithmetic expression uses the same syntax, precedence, and associativity of expression as the C language. All the C language operators that apply to floating point quantities can be used... Variables can be referenced by name within an arithmetic expression without using the parameter expansion syntax. When a variable is referenced, its value is evaluated as an arithmetic expression...

A conditional expression is used with the [[ compound command to test attributes of files and to compare strings. Field splitting and file name generation are not performed on the words between [[ and ]]. Each expression can be constructed from one or more of the following unary or binary expressions...

The following obsolete arithmetic comparisons are also permitted:

exp1 -eq exp2

True, if exp1 is equal to exp2.

exp1 -ne exp2

True, if exp1 is not equal to exp2.

exp1 -lt exp2

True, if exp1 is less than exp2.

exp1 -gt exp2

True, if exp1 is greater than exp2.

exp1 -le exp2

True, if exp1 is less than or equal to exp2.

exp1 -ge exp2

True, if exp1 is greater than or equal to exp2.

The documentation there is consistent where references to arithmetic expressions are concerned, and (apparently carefully) avoids any self-contradictions surrounding the definition of the [[ compound command ]] pertaining to string comparison by explicitly also permitting some obsolete arithmetic comparisons in the same context.

From man bash:

[[ expression ]]

Return a status of 0 or 1 depending on the evaluation of the conditional expression expression. Expressions are composed of the primaries described below... Word splitting and pathname expansion are not performed on the words between the [[ and ]]; ~tilde expansion, ${parameter} and $variable expansion, $((arithmetic expansion)), $(command substitution), <(process substitution), and "\'quote removal are performed. Conditional operators such as -f must be unquoted to be recognized as primaries...

A variable may be assigned to by a statement of the form:

name=[value]

If value is not given, the variable is assigned the null string. All values undergo ~tilde expansion, ${parameter} and $variable expansion, $(command substitution), $((arithmetic expansion)), and "\'quote removal... If the variable has its integer attribute set, then value is evaluated as an $((arithmetic expression)) even if the $((...)) expansion is not used...

The shell allows arithmetic expressions to be evaluated, under certain circumstances... Evaluation is done in fixed-width integers with no check for overflow, though division by 0 is trapped and flagged as an error. The operators and their precedence, associativity, and values are the same as in the C language.

Shell variables are allowed as operands; parameter expansion is performed before the expression is evaluated. Within an expression, shell variables may also be referenced by name without using the parameter expansion syntax. The value of a variable is evaluated as an arithmetic expression when it is referenced, or when a variable which has been given the integer attribute using declare -i is assigned a value... A shell variable need not have its integer attribute turned on to be used in an expression.

Conditional expressions are used by the [[ compound command and the test and [ builtin commands to test file attributes and perform string and arithmetic comparisons...

arg1 OP arg2

OP is one of -eq, -ne, -lt, -le, -gt, or -ge. These arithmetic binary operators return true if arg1 is equal to, not equal to, less than, less than or equal to, greater than, or greater than or equal to arg2, respectively. arg1 and arg2 may be positive or negative integers.

I think, given all that context, the behavior you observe stands to reason, even if it is not explicitly spelled out as a possibility in the documentation there. The docs do point to special treatment of parameters with integer attributes, and clearly denote a difference between a compound command and a builtin command.
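The two statements quoted from the man page ("the value of a variable is evaluated as an arithmetic expression when it is referenced" and "parameter expansion is performed before the expression is evaluated") can be seen side by side in a short sketch. It assumes bash; the result variables r1 to r4 are only for illustration.

```bash
#!/usr/bin/env bash
b=2
a=b+1        # a holds the string "b+1", not a number

# The word a is an arithmetic expression: the shell looks up a, finds
# "b+1", and evaluates that in turn, giving 3.
[[ a -eq 3 ]] && r1=yes || r1=no

# Quoting does not change this; after quote removal it is the same word.
[[ "a" -eq 3 ]] && r2=yes || r2=no

# Referenced by name, a is evaluated first (3), then multiplied: 3*2 = 6.
[[ a*2 -eq 6 ]] && r3=yes || r3=no

# With $a the text is substituted first, so the expression is b+1*2 = 4.
[[ $a*2 -eq 4 ]] && r4=yes || r4=no

echo "$r1 $r2 $r3 $r4"
```

All four tests succeed in bash. The difference between the last two is the practical reason to prefer one form consistently when a variable may hold an expression rather than a plain integer.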

The [[ comparison is syntax in the same sense that the assignment name=value is syntax or case word in ... is syntax. test and [, however, are not; they are separate procedures which take arguments. I think the best way to really get a feel for the differences is to have a look at shell error output:

set   '[[ \\ -eq 0 ]]' '[ \\ -eq 0 ]'
for    sh in   bash ksh
do     for     exp
       do     "$sh" -c  "$1||$2"
               set "$2" "$1"
done;  done

bash: [[: \: syntax error: operand expected (error token is "\")
bash: line 0: [: \: integer expression expected
bash: line 0: [: \: integer expression expected
bash: [[: \: syntax error: operand expected (error token is "\")
ksh: \: arithmetic syntax error
ksh: [: \: arithmetic syntax error
ksh: \: arithmetic syntax error

The two shells handle the exceptions differently, but the underlying reasons for the differences in both cases for both shells are very similar.

bash directly calls the [[ \\ case a syntax error, in the same way it might for a redirect from a non-existent file, for example, though it goes on from that point (incorrectly, I believe) to evaluate the other side of the || (or) expression. bash does give the [[ expression a command name in error output, but note that it doesn't report the line number on which you call it, as it does for the [ command. bash's [ complains about not receiving what it expects to be an integer expression as an argument, but [[ need not complain in that way because it doesn't really take arguments, and never needs to expect anything at all when it is parsed alongside the expansions themselves.
ksh halts altogether when it hits the [[ syntax error and doesn't bother with [ at all. It writes the same error message for both, but note that [ is assigned a command name there where [[ is just ksh. The [ is only called after the command line has been successfully parsed and expansions have already occurred; it will do its own little getopts routine and get its own arg[0c] and the rest, but [[ is handled as underlying shell syntax once again.

I consider the bash docs slightly less clear than the ksh version in that they use the terms arg[12] rather than expression regarding integer comparisons, but I think it is done merely because [[, [, and test are all lumped together at that juncture, and the latter two do take arguments whereas the former only ever receives an expression.
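The argument-versus-expression distinction also shows up with the bare variable name from the question. A minimal sketch, assuming bash; the variable names compound_status and builtin_status are only for illustration.

```bash
#!/usr/bin/env bash
a=2

# [[ is shell syntax: the operand a is evaluated as an arithmetic expression.
[[ a -eq 2 ]]; compound_status=$?

# [ is an ordinary builtin: it receives the literal argument "a" and
# rejects it because it is not an integer (the message is discarded here).
[ a -eq 2 ] 2>/dev/null; builtin_status=$?

echo "[[ status: $compound_status, [ status: $builtin_status"
```

In bash the first status is 0 and the second is nonzero, with [ printing an "integer expression expected" message like the one in the error output above when stderr is not redirected.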

In any case, where the integer comparison is not ambiguous in the syntax context, you can basically do any valid math operation mid-expression:

   m=5+5  a[m]=10
[[     m   -eq 10 ]] &&
[[     m++ -eq 10 ]] &&
[[     m-- -gt 10 ]] &&
[[ ${a[m]}  == 10 ]] &&
echo "math evals"

math evals

Related Solutions

Bash – Why does bash add single quotes to unquoted failed pathname expansions in a command before executing it

When instructed to echo commands as they are executed ("execution trace"), both bash and ksh add single quotes around any word with meta-characters (*, ?, ;, etc.) in it.

The meta-characters could have gotten into the word in a variety of ways. The word (or part of it) could have been quoted with single or double quotes, the characters could have been escaped with a \, or they remained as the result of a failed filename matching attempt. In all cases, the execution trace will contain single-quoted words, for example:

$ set -x
$ echo foo\;bar
+ echo 'foo;bar'

This is just an artifact of the way the shells implement the execution trace; it doesn't alter the way the arguments are ultimately passed to the command. The quotes are added, printed, and discarded. Here is the relevant part of the bash source code, print_cmd.c:

/* A function to print the words of a simple command when set -x is on. */
void
xtrace_print_word_list (list, xtflags)
...
{
  ...
  for (w = list; w; w = w->next)
    {
      t = w->word->word;
      ...
      else if (sh_contains_shell_metas (t))
        {
          x = sh_single_quote (t);
          fprintf (xtrace_fp, "%s%s", x, w->next ? " " : "");
          free (x);
        }

As to why the authors chose to do this, the code there doesn't say. But here's some similar code in variables.c, and it comes with a comment:

/* Print the value cell of VAR, a shell variable.  Do not print
   the name, nor leading/trailing newline.  If QUOTE is non-zero,
   and the value contains shell metacharacters, quote the value
   in such a way that it can be read back in. */
void
print_var_value (var, quote)
...
{
  ...
  else if (quote && sh_contains_shell_metas (value_cell (var)))
    {
      t = sh_single_quote (value_cell (var));
      printf ("%s", t);
      free (t);
    }

So possibly it's done so that it's easier to copy the command lines from the output of the execution trace and run them again.
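That guess can be tried out directly. The sketch below assumes bash and pins PS4 to its default "+ " so the prefix is predictable; the variable names trace, traced_cmd and rerun are only for illustration.

```bash
#!/usr/bin/env bash
# Trace a child bash and capture its stdout and stderr together.
trace=$(PS4='+ ' bash -xc 'echo foo\;bar' 2>&1)
printf '%s\n' "$trace"

# The first captured line is the trace line; strip the "+ " prefix.
traced_cmd=${trace%%$'\n'*}
traced_cmd=${traced_cmd#"+ "}

# Feeding the traced text back to the shell reproduces the original output.
rerun=$(eval "$traced_cmd")
printf 'rerun: %s\n' "$rerun"
```

Because the trace line carries the single quotes, it can be pasted or evaluated again without the ; being treated as a command separator.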

Bash – How to programmatically tell if a filename matches a shell glob pattern

I don't believe that {bar,baz} is a shell glob pattern (though certainly /foo/ba[rz] is) but if you want to know if $string matches $pattern you can do:

case "$string" in
($pattern) echo "match"    ;;  # put your success handling here
(*)        echo "no match" ;;  # put your failure handling here
esac

You can do as many as you like:

case "$string" in
($pattern1) echo "matched pattern1" ;;  # do something
($pattern2) echo "matched pattern2" ;;  # do differently
(*)         echo "still no match"   ;;
esac
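To use this programmatically, the case statement can be wrapped in a small function that returns a status. This is a sketch only; the function name matches is hypothetical and not part of any shell.

```bash
#!/usr/bin/env bash
# matches STRING PATTERN: succeed if STRING matches the glob PATTERN.
# (Hypothetical helper name, chosen for this example.)
matches() {
  case "$1" in
    ($2) return 0 ;;   # $2 is left unquoted so it is treated as a pattern
    (*)  return 1 ;;
  esac
}

matches /foo/bar '/foo/ba[rz]'    && echo "bar: match"
matches /foo/bat '/foo/ba[rz]'    || echo "bat: no match"
# Braces are brace expansion, not globbing, so this is compared literally.
matches /foo/bar '/foo/{bar,baz}' || echo "braces: no match"
```

The pattern is passed in single quotes so that the calling shell does not expand it against the filesystem before the function sees it.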
