So between 2^63 and 2^64-1, you get negative integers showing you how far off from ULONG_MAX you are.
No. How do you figure that? By your own example, the max is:
> max=$((2**63 - 1)); echo $max
9223372036854775807
If "overflow" meant "you get negative integers showing you how far off from ULONG_MAX you are", then if we add one to that, shouldn't we get -1? But instead:
> echo $(($max + 1))
-9223372036854775808
Perhaps you mean this is a number you can add to $max to get a negative difference, since:
> echo $(($max + 1 + $max))
-1
But this does not in fact continue to hold true:
> echo $(($max + 2 + $max))
0
This is because the system uses two's complement to represent signed integers.[1] The value resulting from an overflow is NOT an attempt to provide you with a difference, a negative difference, etc. It is literally the result of truncating a value to a limited number of bits, which is then interpreted as a two's complement signed integer. For example, the reason $(($max + 1 + $max)) comes out as -1 is that the highest value in two's complement has all bits set except the highest bit (which indicates negative); adding two of these together carries the bits to the left, so you end up with (if the size were 16 bits, and not 64):
11111111 11111110
The high (sign) bit is now set because it carried over in the addition. If you add one more (00000000 00000001) to that, you then have all bits set, which in two's complement is -1.
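You can play out the 16-bit analogy directly in bash arithmetic by masking results to 16 bits. This is a sketch for illustration only (bash itself always computes at the full intmax_t width), and to_signed16 is a helper name I'm introducing here, not anything built in:

```shell
# Simulate 16-bit two's complement by masking to 16 bits (illustration only;
# bash itself always computes at the full intmax_t width).
to_signed16() { (( $1 >= 1<<15 )) && echo $(( $1 - (1<<16) )) || echo "$1"; }

max16=$(( (1<<15) - 1 ))                         # 32767 = 01111111 11111111
to_signed16 $(( (max16 + max16) & 0xFFFF ))      # 11111111 11111110 -> -2
to_signed16 $(( (max16 + 1 + max16) & 0xFFFF ))  # 11111111 11111111 -> -1
```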
I think that partially answers the second half of your first question -- "Why are the negative integers...exposed to the end user?" First, because that is the correct value according to the rules of 64-bit two's complement arithmetic. Second, this is the conventional practice of most (other) general-purpose high-level programming languages (I cannot think of one that does not do this), so bash is adhering to convention. That is also the answer to the first part of the first question -- "What's the rationale?": this is the norm in the specification of programming languages.
WRT the 2nd question, I have not heard of systems which interactively change ULONG_MAX.
If someone arbitrarily changes the value of the unsigned integer maximum in limits.h, then recompiles bash, what can we expect will happen?
It would not make any difference to how the arithmetic comes out, because this is not an arbitrary value that is used to configure the system -- it's a convenience value that stores an immutable constant reflecting the hardware. By analogy, you could redefine c to be 55 mph, but the speed of light will still be 186,000 miles per second. c is not a number used to configure the universe -- it's a deduction about the nature of the universe.
ULONG_MAX is exactly the same. It is deduced/calculated based on the nature of N-bit numbers. Changing it in limits.h would be a very bad idea if that constant is used somewhere that assumes it represents the reality of the system.
And you cannot change the reality imposed by your hardware.
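To see that the limit is deduced rather than configured, you can recompute it from the word size yourself. A sketch, assuming bash's intmax_t matches the native word size that getconf reports (true on typical Linux systems):

```shell
# The limit follows from the word size, not from a configurable setting.
# (Assumes bash's intmax_t matches the native word size reported here.)
bits=$(getconf LONG_BIT)            # e.g. 64 on a 64-bit system
echo $(( (1 << (bits - 1)) - 1 ))   # largest signed value for that width
```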
[1] I don't think that this (the means of integer representation) is actually guaranteed by bash, since it depends on the underlying C library, and standard C does not guarantee it. However, this is what is used on most normal modern computers.
Bash uses intmax_t variables for arithmetic. On your system these are 64 bits in length, so:
$ echo $((1<<62))
4611686018427387904
which is
100000000000000000000000000000000000000000000000000000000000000
in binary (1 followed by 62 0s). Shift that again:
$ echo $((1<<63))
-9223372036854775808
which is
1000000000000000000000000000000000000000000000000000000000000000
in binary (1 followed by 63 0s); in two's complement arithmetic, that bit pattern is the most negative representable value.
To get the biggest representable integer, you need to subtract 1:
$ echo $(((1<<63)-1))
9223372036854775807
which is
111111111111111111111111111111111111111111111111111111111111111
in binary.
As pointed out in ilkkachu's answer, shifting takes the offset modulo 64 on 64-bit x86 CPUs (whether using RCL or SHL), which explains the behaviour you're seeing:
$ echo $((1<<64))
1
is equivalent to $((1<<0)). Thus $((1<<1025)) is $((1<<1)), $((1<<1026)) is $((1<<2)), and so on.
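A quick loop makes the modulo-64 behaviour visible. Note that this relies on CPU-specific behaviour (the C standard leaves over-wide shifts undefined), so treat it as a demonstration on x86-64 rather than a guarantee:

```shell
# Shift counts are reduced modulo 64 here (x86-64 behaviour, not a C guarantee).
for n in 64 65 1025 1026; do
    echo "1<<$n = $((1 << n))  (same as 1<<$((n % 64)) = $((1 << (n % 64))))"
done
```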
You'll find the type definitions and maximum values in stdint.h; on your system:
/* Largest integral types. */
#if __WORDSIZE == 64
typedef long int intmax_t;
typedef unsigned long int uintmax_t;
#else
__extension__
typedef long long int intmax_t;
__extension__
typedef unsigned long long int uintmax_t;
#endif
/* Minimum for largest signed integral type. */
# define INTMAX_MIN (-__INT64_C(9223372036854775807)-1)
/* Maximum for largest signed integral type. */
# define INTMAX_MAX (__INT64_C(9223372036854775807))
Best Answer
Bash arithmetic uses signed numbers.
So the quick answer would be:
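The quick answer, if you already know you are on a system with 64-bit bash arithmetic, would be something like this (a minimal sketch, hard-coding the width):

```shell
# Hard-coded for 64-bit intmax_t arithmetic: 2**63 overflows to the most
# negative value, and subtracting 1 wraps around to the maximum.
echo $(( 2**63 - 1 ))   # 9223372036854775807
```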
But since you want your script not to know the bitness of the system it's running on, let's keep going.
Brute force would be to keep adding 1 in a loop until it overflows into a negative number. But that could take years! :-) A quicker and more elegant way to do it is with a simple bit-shift.
Let's find the sign bit, i.e., the number that has a 1 in the most significant bit and zeros in all the other bits, however many there may be. Once we have that number, we simply subtract 1 from it, and we get the largest signed number.
Or, here's a one-liner, without a loop. We put the hex representation of a number in a variable, and then mask the sign bit through variable expansion when passing it to the printf builtin. On a machine with a different bitness than mine, the result will be a different number.
And just for illustration, in my case:
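The hex-masking one-liner likely looked something like this (a reconstruction of the trick described above: printf '%x' -1 yields the all-bits-set pattern, and the ${var/f/7} expansion clears the sign bit by rewriting the first hex digit):

```shell
# All bits set, rendered as hex, then flip the leading f to 7
# to clear the sign bit:
printf -v hex '%x' -1       # "ffffffffffffffff" on a 64-bit system
echo $(( 16#${hex/f/7} ))   # 9223372036854775807 here
```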
A little side note about MIN: you may want to constrain yourself to using ((MIN = -MAX)) rather than the true minimum, -MAX-1, because the true minimum has no positive counterpart and you will occasionally run into problems with some arithmetic operations on it.
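To see why the true minimum is awkward: negating it just wraps back to itself, since its positive counterpart is not representable. A small demonstration on 64-bit bash ($(( min / -1 )) can even crash some shells, so it is left out):

```shell
max=$(( (1<<63) - 1 ))
min=$(( -max - 1 ))   # the true minimum, one below -MAX
echo $(( -min ))      # overflows and wraps back to min itself
```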