Why Bash Shell Doesn’t Warn About Arithmetic Overflow

arithmetic, bash, shell

There are limits to the arithmetic evaluation capabilities of the bash shell. The manual is succinct about this aspect of shell arithmetic, stating:

Evaluation is done in fixed-width integers with no check for overflow,
though division by 0 is trapped and flagged as an error. The operators
and their precedence, associativity, and values are the same as in the
C language.

Which fixed-width integer this refers to really comes down to which data type is used (the specifics of why are beyond the scope of this question), but the limit value is expressed in /usr/include/limits.h in this fashion:

#  if __WORDSIZE == 64
#   define ULONG_MAX     18446744073709551615UL
#  ifdef __USE_ISOC99
#  define LLONG_MAX       9223372036854775807LL
#  define ULLONG_MAX    18446744073709551615ULL

And once you know that, you can confirm it like so:

# getconf -a | grep -i 'long'
LONG_BIT                           64
ULONG_MAX                          18446744073709551615

This is a 64-bit integer, and it translates directly into the shell's arithmetic evaluation:

# echo $(((2**63)-1)); echo $((2**63)); echo $(((2**63)+1)); echo $((2**64))
9223372036854775807        //the practical usable limit for your everyday use
-9223372036854775808       //you're that much "away" from 2^64
-9223372036854775807     
0
# echo $((9223372036854775808+9223372036854775807))
-1

So between 2^63 and 2^64-1, you get negative integers showing you how far off from ULONG_MAX you are¹. When the evaluation reaches that limit and overflows, whatever the magnitude, you get no warning, and that part of the evaluation is reset to 0, which may yield some unusual behavior with something like right-associative exponentiation, for instance:

echo $((6**6**6))                      0   // 6^46656 overflows to 0
echo $((6**6**6**6))                   1   // 6^(6^46656) = 6^0 = 1
echo $((6**6**6**6**6))                6   // 6^(6^(6^46656)) = 6^(6^0) = 6^1 = 6
echo $((6**6**6**6**6**6))         46656   // 6^(6^(6^(6^46656))) = 6^(6^1) = 6^6 = 46656
echo $((6**6**6**6**6**6**6))          0   // 6^(6^(6^(6^(6^46656)))) = 6^(6^6) = 6^46656, overflows to 0
...
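
A quick sanity check on those zeros with an arbitrary-precision tool (bc, which I come back to below), assuming the overflow simply keeps the low 64 bits of the true result: the exact value of 6^46656 contains the factor 2^46656, so it is a multiple of 2^64 and its low 64 bits really are all zero:

# echo '(6^46656) % (2^64)' | bc
0

After that, each longer chain just collapses through 6^0 = 1, 6^1 = 6, 6^6 = 46656, and so on, as annotated above.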

Using sh -c 'command' doesn't change anything, so I have to assume this is normal, standards-compliant output. Now that I think I have a basic but concrete understanding of the arithmetic range and limits and of what they mean for expression evaluation in the shell, I thought I would quickly peek at what data types other software on Linux uses. I used some bash sources I had on hand to complement the input of this command:

{ shopt -s globstar; for i in /path/to/source_bash-4.2/include/**/*.h /usr/include/**/*.h; do grep -HE '\b(([UL])|(UL)|())LONG|\bFLOAT|\bDOUBLE|\bINT' "$i"; done; } | grep -iE 'bash.*max'

bash-4.2/include/typemax.h:#    define LLONG_MAX   TYPE_MAXIMUM(long long int)
bash-4.2/include/typemax.h:#    define ULLONG_MAX  TYPE_MAXIMUM(unsigned long long int)
bash-4.2/include/typemax.h:#    define INT_MAX     TYPE_MAXIMUM(int)

There's more output (the surrounding #if lines, for instance), and I could run the same search for a command like awk too. I notice that the regular expression I used doesn't catch anything for the arbitrary-precision tools I have, such as bc and dc.
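
Those tools are not bound by the fixed-width limit at all, which is easy to see side by side (the shell wraps to 0 where bc keeps going):

# echo $((2**64)); echo '2^64' | bc
0
18446744073709551616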


Questions

  1. What is the rationale for not warning you (like awk does when evaluating 2^1024) when your arithmetic evaluation overflows? Why are negative integers exposed to the end user for evaluations that land between 2^63 and 2^64-1?
  2. I have read somewhere that some flavors of UNIX can interactively change ULONG_MAX. Has anyone heard of this?
  3. If someone arbitrarily changes the value of the unsigned integer maximum in limits.h, then recompiles bash, what can we expect will happen?

Note

1. I wanted to illustrate more clearly what I saw, as it is very simple empirical stuff. What I noticed is that:

  • (a) Any evaluation whose result is <= 2^63-1 is correct.
  • (b) Any evaluation whose result is >= 2^63, up to 2^64, gives a negative
    integer:

    • The range of that integer is x to y, where x = -9223372036854775808 and y = 0.

Considering this, an evaluation like (b) can be expressed as (2^63-1) plus something within x..y. For instance, if we literally ask the shell to evaluate (2^63-1) + 100002 (it could be any suitably small number), we get -9223372036854675807. I'm just stating the obvious I guess, but this also means that the two following expressions:

  • (2^63-1) + 100002, AND;
  • (2^63-1) + (LLONG_MAX - {what the shell gives us for (2^63-1) + 100002,
    which is -9223372036854675807}), which, using positive values, is:

    • (2^63-1) + (9223372036854775807 - 9223372036854675807 = 100000)
    • = 9223372036854775807 + 100000

are very close indeed. The second expression is "2" apart from (2^63-1) + 100002, i.e. what we're evaluating. This is what I mean by "you get negative integers showing you how far off from 2^64 you are". With those negative integers and knowledge of the limits, you cannot finish the evaluation within the x..y range in the bash shell, but you can elsewhere: the data is usable up to 2^64 in that sense (I could add it up on paper or use it in bc). Beyond that, however, the behavior is similar to that of 6^6^6, as the limit is reached in the way described above in the question.
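
To make the note concrete, the reconstruction can be done mechanically: for a true result between 2^63 and 2^64-1, what the shell prints is simply that result minus 2^64, which bc confirms:

# echo $(( (2**63 - 1) + 100002 ))
-9223372036854675807
# echo '(2^63 - 1) + 100002 - 2^64' | bc
-9223372036854675807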

Best Answer

So between 2^63 and 2^64-1, you get negative integers showing you how far off from ULONG_MAX you are.

No. How do you figure that? By your own example, the max is:

> max=$((2**63 - 1)); echo $max
9223372036854775807

If "overflow" meant "you get negative integers showing you how far off from ULONG_MAX you are", then if we add one to that, shouldn't we get -1? But instead:

> echo $(($max + 1))
-9223372036854775808

Perhaps you mean this is a number you can add to $max to get a negative difference, since:

> echo $(($max + 1 + $max))
-1

But this does not in fact continue to hold true:

> echo $(($max + 2 + $max))
0

This is because the system uses two's complement to implement signed integers.1 The value resulting from an overflow is NOT an attempt to provide you with a difference, a negative difference, etc. It is literally the result of truncating a value to a limited number of bits, then having it interpreted as a two's complement signed integer. For example, the reason $(($max + 1 + $max)) comes out as -1 is because the highest value in two's complement has all bits set except the highest bit (which indicates negative); adding two of those together effectively shifts everything one bit to the left, so you end up with (if the size were 16 bits, and not 64):

11111111 11111110

The high (sign) bit is now set because it carried over in the addition. If you add one more (00000000 00000001) to that, you then have all bits set, which in two's complement is -1.
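
You can see that bit pattern directly from the shell, since bash's printf builtin will reinterpret the signed result in hex or as an unsigned value (just an illustration of the reinterpretation; the arithmetic expansion itself never exposes this):

> printf '%x\n' $(($max + 1 + $max))
ffffffffffffffff
> printf '%u\n' $(($max + 1 + $max))
18446744073709551615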

I think that partially answers the second half of your first question -- "Why are the negative integers...exposed to the end user?". First, because that is the correct value according to the rules of 64-bit two's complement numbers. This is the conventional practice of most (other) general purpose high level programming languages (I cannot think of one that does not do this), so bash is adhering to convention. Which is also the answer to the first part of the first question -- "What's the rationale?": this is the norm in the specification of programming languages.

WRT the 2nd question, I have not heard of systems which interactively change ULONG_MAX.

If someone arbitrarily changes the value of the unsigned integer maximum in limits.h, then recompiles bash, what can we expect will happen?

It would not make any difference to how the arithmetic comes out, because this is not an arbitrary value that is used to configure the system -- it's a convenience value that stores an immutable constant reflecting the hardware. By analogy, you could redefine c to be 55 mph, but the speed of light will still be 186,000 miles per second. c is not a number used to configure the universe -- it's a deduction about the nature of the universe.

ULONG_MAX is exactly the same. It is deduced/calculated based on the nature of N-bit numbers. Changing it in limits.h would be a very bad idea if that constant is used somewhere assuming it is supposed to represent the reality of the system.
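
Put differently, ULONG_MAX for an N-bit word is just 2^N - 1, so as a quick sketch (leaning on getconf and bc, both already used above) you can rederive it instead of looking it up:

> echo "2^$(getconf LONG_BIT) - 1" | bc
18446744073709551615

which matches the ULONG_MAX value that getconf reported in the question.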

And you cannot change the reality imposed by your hardware.


1. I don't think that this (the means of integer representation) is actually guaranteed by bash, since it depends on the underlying C implementation and hardware, and standard C does not guarantee it. However, this is what is used on most normal modern computers.