Bash Arithmetic – Troubleshooting Bitwise Operations Not Working

arithmetic, bash

I ran into a strange problem. To demonstrate, let's take the largest unsigned number on my machine (printf "%X \n" -1 gives me FFFFFFFFFFFFFFFF), and try to shift some bits.  First, shift to the left:

printf "%X \n" $(( 0xFFFFFFFFFFFFFFFF<<4 ))
FFFFFFFFFFFFFFF0
printf "%X \n" $(( 0xFFFFFFFFFFFFFFFF<<8 ))
FFFFFFFFFFFFFF00
printf "%X \n" $(( 0xFFFFFFFFFFFFFFFF<<16 ))
FFFFFFFFFFFF0000

So far so good. As expected. Now let's try the right shift:

printf "%X \n" $(( 0xFFFFFFFFFFFFFFFF>>4 ))
FFFFFFFFFFFFFFFF
printf "%X \n" $(( 0xFFFFFFFFFFFFFFFF>>8 ))
FFFFFFFFFFFFFFFF
printf "%X \n" $(( 0xFFFFFFFFFFFFFFFF>>16 ))
FFFFFFFFFFFFFFFF

Wait, what?? Why is this not working? Is that a bug?


Edit:

I am dreading that someone will suggest some connection with the sign bit being raised. But we are not talking about arithmetic, so the concept of sign has no place here. Other tools like * and / are for arithmetic. The whole point of having a tool that can manipulate bits is to be able to manipulate bits, no matter how I'll choose to display those bits later, as signed or as unsigned. Right? Like:

printf "%u \n" -1
18446744073709551615

Any ideas anybody?

EDIT:

Since the answers here went straight to talking about multiplication or division, let me try to explain my concern more clearly. Multiplication/division and bit-shifting are two different things, although I can see the connection between them in the minds of long-time programmers. When doing arithmetic, you have to have the concept of sign; for bit-shifting you don't. Bash has given us two distinctly different sets of tools for these two different things. When I want to multiply a number by 2, I reach for the * tool. The fact that under the hood Bash can use bit-shifts for arithmetic is beside the point.

To quote one of the answers…

If the sign bit wasn't copied, the result would turn into an unsigned number. E.g. shifting the 8-bit value 1111 0000 once to the right would give 0111 1000

But turning 1111 0000 into 0111 1000 is exactly what I want. If I wanted to do a division, then I would use an arithmetic operator instead.

Anyway, is there at least some way of explicitly specifying with what kind of bits it should fill when shifting?

Best Answer

There are two different ways to shift right in common use.

The "logical right shift" inserts zero bits on the left, so the result of shifting one bit to the right corresponds to dividing the unsigned binary number by two. echo $(( 16 >> 1 )) gives 8.

The "arithmetic right shift" inserts a copy of the sign bit on the left, so the result of shifting one bit to the right corresponds to dividing the signed binary number by two. echo $(( 16 >> 1 )) gives 8, and echo $(( -16 >> 1 )) gives -8. On two's complement numbers, though, the rounding doesn't match that of an actual division: the shift rounds toward negative infinity, while division truncates toward zero. So -15 >> 1 gives -8, while -15 / 2 gives -7.
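
The rounding difference is easy to check directly. This is a small sketch that assumes the shell uses an arithmetic right shift for negative numbers (common, but implementation-defined, as discussed below); the variable names are only for illustration.

```bash
# Assumption: >> on a negative value is an arithmetic shift in this shell.
shifted=$(( -15 >> 1 ))   # rounds toward negative infinity
divided=$(( -15 / 2 ))    # truncates toward zero

echo "shift: $shifted"    # -8 with an arithmetic shift
echo "divide: $divided"   # -7
```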

If the sign bit wasn't copied, but zeroed, the result would be a positive number. E.g. shifting the 8-bit value 1111 0000 (0xf0, -16) once to the right would give 0111 1000 (0x78, +120).
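
To see both results for the 8-bit example, the wider shell integer can be masked back down to 8 bits. This is only a sketch: it assumes the shell's integers are wider than 8 bits and that >> is an arithmetic shift for negative values; the variable names are made up for the example.

```bash
bits=0xf0

# Treated as a small positive number, the shift fills with zeros.
logical=$(( (bits >> 1) & 0xff ))

# Reinterpret the same 8 bits as signed (-16), shift, and mask back to 8 bits.
# Assumption: >> copies the sign bit for negative values.
signed=$(( bits - 0x100 ))
arithmetic=$(( (signed >> 1) & 0xff ))

printf 'logical:    0x%02x\n' "$logical"      # 0x78
printf 'arithmetic: 0x%02x\n' "$arithmetic"   # 0xf8 with an arithmetic shift
```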


Now, which one of these is used is a hairier matter.

In practice, many implementations would use the arithmetic shift for signed numbers, and shell arithmetic is mostly done on a signed long.

But that's not exactly guaranteed, at all. The POSIX definition for shell arithmetic refers to the C standard for most of the behaviour, and e.g. the operator table doesn't say anything about what sort of a shift >> is supposed to be. (see: Shell Command Language, 2.6.4 Arithmetic Expansion and Shell & Utilities, 1.1.2 Concepts Derived from the ISO C Standard: Arithmetic Precision and Operations)

Integer variables and constants, including the values of operands and option-arguments, [...] shall be implemented as equivalent to the ISO C standard signed long data type [...]

Arithmetic operators and control flow keywords shall be implemented as equivalent to those in the cited ISO C standard section, [...]
<<, >>: Section 6.5.7, Bitwise Shift Operators

cppreference.com says of the C operators that

For negative a, the value of a >> b is implementation-defined (in most implementations, this performs arithmetic right shift, so that the result remains negative).

(That may be a remnant of a world where not everything was two's complement. A shift to the right of a ones' complement or a sign-magnitude number would be different from the shift to the right of a two's complement number. But the result is the same: implementation-defined it is.)

Some other programming languages, like Javascript, have distinct operators for arithmetic right shift >>, and logical right shift >>>. But C doesn't, and neither do any of the shells I tried.
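
Since there is no >>> operator, a logical right shift has to be built by hand. The function below is a minimal sketch, not a built-in: the name lsr is hypothetical, and it assumes shell integers are 64-bit two's complement (as in the question). It shifts by one bit, clears the top bit regardless of what >> put there, and then shifts the now non-negative value the rest of the way.

```bash
# Hypothetical helper: logical right shift of a 64-bit value.
# Assumption: shell arithmetic uses 64-bit two's complement integers.
lsr() {
    local x=$(( $1 )) n=$(( $2 ))
    if (( n <= 0 )); then
        echo "$x"                 # nothing to shift
    elif (( n >= 64 )); then
        echo 0                    # every bit shifted out
    else
        # After masking, the value is non-negative, so the remaining
        # shift fills with zeros under either kind of >>.
        echo $(( ((x >> 1) & 0x7FFFFFFFFFFFFFFF) >> (n - 1) ))
    fi
}

printf '%X\n' "$(lsr 0xFFFFFFFFFFFFFFFF 4)"    # FFFFFFFFFFFFFFF
printf '%X\n' "$(lsr 0xFFFFFFFFFFFFFFFF 16)"   # FFFFFFFFFFFF
```

Splitting the shift in two avoids computing a mask like (1 << (64 - n)) - 1, which would overflow the signed range for n = 1.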

As an aside, if you were to do shifts with offsets greater than the word width, you'd also see strange things happen. On an x86, 1 << 64 is just 1, because the processor only looks at the lowest 6 bits of the shift value, so it's the same as 1 << 0. (1 << 32) << 32 is 0, though, and the result might be different on another processor.
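
A quick way to see this on a given machine, assuming 64-bit shell integers. No expected value is given for the first result because it depends on the processor; the variable names are only illustrative.

```bash
# Shift count equal to the word width: result is platform-dependent.
wide=$(( 1 << 64 ))

# Same total shift done in two in-range steps: the bit falls off the top.
split=$(( (1 << 32) << 32 ))

echo "1 << 64         = $wide"
echo "(1 << 32) << 32 = $split"   # 0
```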


You said,

But the concept of sign has no place here. I mean, a number is a number, regardless of whether later you'll chose to display it as signed or unsigned, right?

And that's true for addition, subtraction, and the low part of a multiplication (e.g. 32x32 -> 32) on a two's complement machine.

But it's not true for the high part of a multiplication or division in general. The 8-bit value 0xff can mean the unsigned number 255 or the signed number -1. An 8x8 -> 16 multiplication for e.g. 0xff * 0xff is either 0x0001 or 0xfe01, depending on if it's signed (-1 * -1) or unsigned (255 * 255). Also e.g. 0xff / 3 is either 0 or 0x55, depending on if it's signed (-1 / 3 == 0), or unsigned (255 / 3 == 85).
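
The same numbers can be reproduced in the shell by picking the signed or unsigned reading of 0xff by hand and masking the products to 16 bits. This is just a sketch with illustrative variable names; the shell itself is doing ordinary wide signed arithmetic here.

```bash
# 0xff read as signed 8-bit is -1; read as unsigned it is 255.
signed_mul=$(( (-1 * -1) & 0xffff ))
unsigned_mul=$(( (255 * 255) & 0xffff ))
signed_div=$(( -1 / 3 ))
unsigned_div=$(( 255 / 3 ))

printf 'mul: signed 0x%04x, unsigned 0x%04x\n' "$signed_mul" "$unsigned_mul"   # 0x0001, 0xfe01
printf 'div: signed %d, unsigned 0x%02x\n' "$signed_div" "$unsigned_div"       # 0, 0x55
```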
