So between 2^63 and 2^64-1, you get negative integers showing you how far off from ULONG_MAX you are.
No. How do you figure that? By your own example, the max is:
> max=$((2**63 - 1)); echo $max
9223372036854775807
If "overflow" meant "you get negative integers showing you how far off from ULONG_MAX you are", then if we add one to that, shouldn't we get -1? But instead:
> echo $(($max + 1))
-9223372036854775808
Perhaps you mean this is a number you can add to $max to get a negative difference, since:
> echo $(($max + 1 + $max))
-1
But this does not in fact continue to hold true:
> echo $(($max + 2 + $max))
0
This is because the system uses two's complement to represent signed integers.[1] The value resulting from an overflow is NOT an attempt to provide you with a difference, a negative difference, etc. It is literally the result of truncating a value to a limited number of bits, which is then interpreted as a two's complement signed integer. For example, the reason $(($max + 1 + $max)) comes out as -1 is that the highest value in two's complement has all bits set except the highest bit (which indicates negative); adding two of these together carries the bits to the left, so you end up with (if the size were 16 bits, and not 64):
11111111 11111110
The high (sign) bit is now set because it carried over in the addition. If you add one more (00000000 00000001) to that, you then have all bits set, which in two's complement is -1.
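You can play out the 16-bit analogy directly in bash arithmetic by masking results to 16 bits. This is a sketch for illustration only (bash itself always computes at the full intmax_t width), and to_signed16 is a helper name I'm introducing here, not anything built in:

```shell
# Simulate 16-bit two's complement by masking to 16 bits (illustration only;
# bash itself always computes at the full intmax_t width).
to_signed16() { (( $1 >= 1<<15 )) && echo $(( $1 - (1<<16) )) || echo "$1"; }

max16=$(( (1<<15) - 1 ))                         # 32767 = 01111111 11111111
to_signed16 $(( (max16 + max16) & 0xFFFF ))      # 11111111 11111110 -> -2
to_signed16 $(( (max16 + 1 + max16) & 0xFFFF ))  # 11111111 11111111 -> -1
```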
I think that partially answers the second half of your first question -- "Why are the negative integers...exposed to the end user?" First, because that is the correct value according to the rules of 64-bit two's complement arithmetic. Second, this is the conventional practice of most (other) general-purpose high-level programming languages (I cannot think of one that does not do this), so bash is adhering to convention. That is also the answer to the first part of the first question -- "What's the rationale?": this is the norm in the specification of programming languages.
WRT the 2nd question, I have not heard of systems which interactively change ULONG_MAX.
If someone arbitrarily changes the value of the unsigned integer maximum in limits.h, then recompiles bash, what can we expect will happen?
It would not make any difference to how the arithmetic comes out, because this is not an arbitrary value that is used to configure the system -- it's a convenience value that stores an immutable constant reflecting the hardware. By analogy, you could redefine c to be 55 mph, but the speed of light will still be 186,000 miles per second. c is not a number used to configure the universe -- it's a deduction about the nature of the universe.
ULONG_MAX is exactly the same. It is deduced/calculated based on the nature of N-bit numbers. Changing it in limits.h would be a very bad idea if that constant is used somewhere that assumes it represents the reality of the system.
And you cannot change the reality imposed by your hardware.
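To see that the limit is deduced rather than configured, you can recompute it from the word size yourself. A sketch, assuming bash's intmax_t matches the native word size that getconf reports (true on typical Linux systems):

```shell
# The limit follows from the word size, not from a configurable setting.
# (Assumes bash's intmax_t matches the native word size reported here.)
bits=$(getconf LONG_BIT)            # e.g. 64 on a 64-bit system
echo $(( (1 << (bits - 1)) - 1 ))   # largest signed value for that width
```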
[1] I don't think that this (the means of integer representation) is actually guaranteed by bash, since it depends on the underlying C library, and standard C does not guarantee it. However, this is what is used on most normal modern computers.
Bash uses intmax_t variables for arithmetic. On your system these are 64 bits in length, so:
$ echo $((1<<62))
4611686018427387904
which is
100000000000000000000000000000000000000000000000000000000000000
in binary (1 followed by 62 0s). Shift that again:
$ echo $((1<<63))
-9223372036854775808
which is
1000000000000000000000000000000000000000000000000000000000000000
in binary (1 followed by 63 0s); in two's complement arithmetic, that bit pattern is the most negative representable value.
To get the biggest representable integer, you need to subtract 1:
$ echo $(((1<<63)-1))
9223372036854775807
which is
111111111111111111111111111111111111111111111111111111111111111
in binary.
As pointed out in ilkkachu's answer, shifting takes the offset modulo 64 on 64-bit x86 CPUs (whether using RCL or SHL), which explains the behaviour you're seeing:
$ echo $((1<<64))
1
is equivalent to $((1<<0)). Thus $((1<<1025)) is $((1<<1)), $((1<<1026)) is $((1<<2)), and so on.
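A quick loop makes the modulo-64 behaviour visible. Note that this relies on CPU-specific behaviour (the C standard leaves over-wide shifts undefined), so treat it as a demonstration on x86-64 rather than a guarantee:

```shell
# Shift counts are reduced modulo 64 here (x86-64 behaviour, not a C guarantee).
for n in 64 65 1025 1026; do
    echo "1<<$n = $((1 << n))  (same as 1<<$((n % 64)) = $((1 << (n % 64))))"
done
```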
You'll find the type definitions and maximum values in stdint.h; on your system:
/* Largest integral types. */
#if __WORDSIZE == 64
typedef long int intmax_t;
typedef unsigned long int uintmax_t;
#else
__extension__
typedef long long int intmax_t;
__extension__
typedef unsigned long long int uintmax_t;
#endif
/* Minimum for largest signed integral type. */
# define INTMAX_MIN (-__INT64_C(9223372036854775807)-1)
/* Maximum for largest signed integral type. */
# define INTMAX_MAX (__INT64_C(9223372036854775807))
Best Answer
Bash arithmetic uses signed numbers.
So the quick answer would be:
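The quick answer, if you already know you are on a system with 64-bit bash arithmetic, would be something like this (a minimal sketch, hard-coding the width):

```shell
# Hard-coded for 64-bit intmax_t arithmetic: 2**63 overflows to the most
# negative value, and subtracting 1 wraps around to the maximum.
echo $(( 2**63 - 1 ))   # 9223372036854775807
```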
But since you want your script not to know the bitness of the system it's running on, let's keep going.
Brute force would be to keep adding 1 in a loop until it overflows into a negative number. But that could take years! :-) A quicker and more elegant way to do it is with a simple bit-shift.
Let's find the sign bit, i.e., the number that has a 1 in the most significant bit and zeros in all the other bits, however many there may be. Once we have that number, we simply subtract 1 from it, and we get the largest signed number.
Or, here's a one-liner, without a loop. We put the hex representation of a number in a variable, and then mask the sign bit through variable expansion when passing it to the printf builtin. On a machine with a different bitness than mine, the result will be a different number.
And just for illustration, in my case:
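The hex-masking one-liner likely looked something like this (a reconstruction of the trick described above: printf '%x' -1 yields the all-bits-set pattern, and the ${var/f/7} expansion clears the sign bit by rewriting the first hex digit):

```shell
# All bits set, rendered as hex, then flip the leading f to 7
# to clear the sign bit:
printf -v hex '%x' -1       # "ffffffffffffffff" on a 64-bit system
echo $(( 16#${hex/f/7} ))   # 9223372036854775807 here
```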
A little side note about MIN: you may want to constrain yourself to using ((MIN = -MAX)) rather than the true minimum, -MAX-1, because the true minimum has no positive counterpart and you will occasionally run into problems with some arithmetic operations on it.
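To see why the true minimum is awkward: negating it just wraps back to itself, since its positive counterpart is not representable. A small demonstration on 64-bit bash ($(( min / -1 )) can even crash some shells, so it is left out):

```shell
max=$(( (1<<63) - 1 ))
min=$(( -max - 1 ))   # the true minimum, one below -MAX
echo $(( -min ))      # overflows and wraps back to min itself
```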