"So between 2^63 and 2^64-1, you get negative integers showing you how far off from ULONG_MAX you are."
No. How do you figure that? By your own example, the max is:
> max=$((2**63 - 1)); echo $max
9223372036854775807
If "overflow" meant "you get negative integers showing you how far off from ULONG_MAX you are", then if we add one to that, shouldn't we get -1? But instead:
> echo $(($max + 1))
-9223372036854775808
Perhaps you mean this is a number you can add to $max to get a negative difference, since:
> echo $(($max + 1 + $max))
-1
But this does not in fact continue to hold true:
> echo $(($max + 2 + $max))
0
This is because the system uses two's complement to implement signed integers.[1] The value resulting from an overflow is NOT an attempt to provide you with a difference, a negative difference, etc. It is literally the result of truncating a value to a limited number of bits, then having it interpreted as a two's complement signed integer. For example, the reason $(($max + 1 + $max)) comes out as -1 is because the highest value in two's complement is all bits set except the highest bit (which indicates negative); adding these together basically means carrying all the bits to the left, so you end up with (if the size were 16 bits, and not 64):
11111111 11111110
The high (sign) bit is now set because it carried over in the addition. If you add one more (00000000 00000001) to that, you then have all bits set, which in two's complement is -1.
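You can watch that truncation happen in bash itself by printing the same sums in hex (printf's %x shows the 64-bit two's-complement bit pattern):
> max=$((2**63 - 1)); printf '%x\n' $((max + max)) $((max + 1 + max))
fffffffffffffffe
ffffffffffffffff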
I think that partially answers the second half of your first question -- "Why are the negative integers...exposed to the end user?". First, because that is the correct value according to the rules of 64-bit two's complement numbers. Second, this is the conventional practice of most (other) general-purpose high-level programming languages (I cannot think of one that does not do this), so bash is adhering to convention. That is also the answer to the first part of the first question -- "What's the rationale?": this is the norm in the specification of programming languages.
WRT the 2nd question, I have not heard of systems which interactively change ULONG_MAX.
"If someone arbitrarily changes the value of the unsigned integer maximum in limits.h, then recompiles bash, what can we expect will happen?"
It would not make any difference to how the arithmetic comes out, because this is not an arbitrary value that is used to configure the system -- it's a convenience value that stores an immutable constant reflecting the hardware. By analogy, you could redefine c to be 55 mph, but the speed of light will still be 186,000 miles per second. c is not a number used to configure the universe -- it's a deduction about the nature of the universe.
ULONG_MAX is exactly the same. It is deduced/calculated based on the nature of N-bit numbers. Changing it in limits.h would be a very bad idea if that constant is used somewhere assuming it is supposed to represent the reality of the system. And you cannot change the reality imposed by your hardware.
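You can see that reality from the shell, too: bash arithmetic is simply arithmetic modulo 2^64 (with the result read back as a signed value), regardless of what any header says:
> echo $((2**64)) $((2**64 + 5))
0 5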
[1] I don't think that this (the means of integer representation) is actually guaranteed by bash, since it depends on the underlying C library, and standard C does not guarantee that. However, this is what is used on most normal modern computers.
If you take a look at the nologin man page, you'll see the following excerpt:
nologin displays a message that an account is not available and exits
non-zero. It is intended as a replacement shell field to deny login access
to an account.
If the file /etc/nologin.txt exists, nologin displays its contents to the
user instead of the default message.
The exit code returned by nologin is always 1.
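You can verify both claims by running it directly (the path is /sbin/nologin on some systems, /usr/sbin/nologin on others):
> /sbin/nologin
This account is currently not available.
> echo $?
1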
So the actual intent of nologin is that, when a user attempts to log in with an account that makes use of it in /etc/passwd, they're presented with a user-friendly message, and any scripts/commands that attempt to make use of this login receive an exit code of 1.
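For instance, on a typical Linux system (daemon is just one common account shipped with a nologin shell, and the shell field is the last field of the passwd entry; run as root so su doesn't prompt for a password):
> grep '^daemon:' /etc/passwd
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
> su - daemon
This account is currently not available.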
Security
With respect to security, you'll typically see either /sbin/nologin or sometimes /bin/false, among other things, in that field. They both serve the same purpose, but /sbin/nologin is probably the preferred method. In any case, they're limiting direct access to a shell as this particular user account.
Why is this considered valuable with respect to security?
The "why" is hard to fully describe, but the value in limiting a user's account in this manner, is that it thwarts direct access via the login
application when you attempt to gain access using said user account.
Using either nologin
or /bin/false
accomplishes this. Limiting your system's attack surface is a common technique in the security world, whether disabling services on specific ports, or limiting the nature of the logins on one's systems.
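For example, to lock an existing account's shell down this way (alice is a placeholder account name, and the nologin path varies between /sbin and /usr/sbin by distribution):
> sudo usermod -s /usr/sbin/nologin alice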
Still, there are other rationalizations for using nologin. For example, scp will no longer work with a user account that does not designate an actual shell, as described in this ServerFault Q&A titled: What is the difference between /sbin/nologin and /bin/false?.
Best Answer
The problem is in cases where the content of $x has not been sanitized and contains data that could potentially be under the control of an attacker, in cases where shell code may end up being used in a privilege escalation context (for instance a script invoked by a setuid application, a sudoers script, or a script used to process off-the-network data (CGI, DHCP hook...) directly or indirectly).

If $x contains (PATH=2), then:

x=$((1-$x))

has the side effect of setting PATH to 2 (a relative path that could very well be under the control of the attacker). You can replace PATH with LD_LIBRARY_PATH or IFS... The same happens with x=$((1-x)) in bash, zsh or ksh (not dash nor yash, which only accept numerical constants in variables there).
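You can demonstrate the PATH side effect harmlessly (in a real attack, the value of x would come from untrusted input rather than being set by hand):
> x='(PATH=2)' bash -c 'x=$((1-$x)); echo "$PATH"'
2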
Note that x=$((1-$x)) won't work properly for negative values of $x in some shells that implement the (optional as per POSIX) -- (decrement) operator: with x=-1, that means asking the shell to evaluate the 1--1 arithmetic expression. "$((1-x))" doesn't have the problem, as x is expanded as part of (not before) the arithmetic evaluation.
In bash, zsh and ksh (not dash or yash), if x is:

a[0$(uname>&2)]

then the expansion of $((1-$x)) or $((1-x)) causes that uname command to be executed (for zsh, a needs to be an array variable, but one can use psvar for instance for that).
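To see the command execution in action (uname writes to stderr here, and the arithmetic still quietly yields a result):
> x='a[0$(uname>&2)]' bash -c 'echo "$((1-x))"'
Linux
1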
In summary, one shouldn't use uninitialised or non-sanitized external data in arithmetic expressions in shells.

Note that arithmetic evaluation can be done by $((...)) (aka $[...] in bash or zsh), but also, depending on the shell, in the let, [/test, declare/typeset/export..., return, break, continue, exit, printf and print builtins, in array indices, and in the ((..)) and [[...]] constructs, to name a few.

Because it applies to array indices in ksh/zsh/bash, it also applies to all builtins that take variable names as argument ([/test with -v, read, unset, export/typeset/readonly, print/printf with -v, getopts...).

The fact that operands of numeric test operators are treated as arithmetic expressions with [[...]] and not with the [/test builtin is one reason why, in bash or zsh, it's often preferable to use the latter. Compare:

> x='a[0$(uname>&2)]' bash -c 'test "$x" -eq 1'

(safe) with:

> x='a[0$(uname>&2)]' bash -c '[[ $x -eq 1 ]]'
Linux

(uname was executed; in ksh, both [ and [[ ... ]] have the problem).

To check that a variable contains a literal decimal integer number, you can use POSIXly:

case $x in
  ("" | *[!0123456789]*)
    echo >&2 "not a valid decimal integer number"
    exit 1
esac
Beware that [0-9] and [[:digit:]] in some locales match more than 0123456789, so they should be avoided for any form of input validation/sanitisation.

Also remember that numbers with leading zeros are treated as octal in some contexts (010 is sometimes 10, sometimes 8), and beware that the check above will let through numbers that are potentially bigger than the maximum integer supported by your system (or by whatever application you will use that integer in; bash for instance treats 18446744073709551616 as 0, as that's 2^64). So you might want to add extra checks in that case statement above, as sketched below.
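A sketch of what those extra checks might look like, assuming 64-bit signed integers (whose maximum, 9223372036854775807, is 19 digits long):

case $x in
  ("" | *[!0123456789]*)
    echo >&2 "not a valid decimal integer number"
    exit 1;;
  (0?*)
    echo >&2 "leading zeros would be treated as octal in some contexts"
    exit 1;;
  (????????????????????*)
    # 20 or more digits: necessarily larger than 2^63 - 1.
    # (19-digit values can still overflow; compare against the
    # maximum explicitly if you need an exact bound.)
    echo >&2 "number out of range"
    exit 1;;
esac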