Bash – In bash, how to convert 8 bytes to an unsigned int (64bit LE)

arithmetic bash

How can I 'read/interpret' 8 bytes as an unsigned int (Little Endian)?
Perhaps there is a Bash-fu magic conversion for this?

UPDATE:
It seems that something got cross-wired in the interpretation of my question. Here is a broader example of what I am trying to do.

I want to read the first (and last) 64k of a file. Each 8-byte word is to be interpreted as a 64-bit Little-Endian unsigned integer. These integers are to be used in a hashing computation which uniquely identifies the file. So there are a lot of calculations to make, ∴ speed is preferred, but not critical. (Why am I doing it? Because smplayer hashes the names of its played-media .ini files, and I want to access and modify these files, so I am mimicking smplayer's C++ code in Bash.)

A solution which accepts piped input would be optimal, and is probably essential because Bash variables can't hold \x00 bytes.

I realize that something like this is probably better suited to the likes of Python, Perl, and C/C++, but I don't know Python and Perl, and although I could do it in C++, it's been years since I've used it and I'm trying to focus on Bash.

Short Perl and Python snippets are good. Bash is preferred (but not at the sacrifice of speed).

Best Answer

Bash is the wrong tool altogether. Shells are good at gluing bits and pieces together; text processing and arithmetic are provided on the side, and data processing isn't in their purview at all.

I'd go for Python over Perl, because Python has bignums right off the bat. Use struct.unpack to unpack the data.

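#!/usr/bin/env python3
import os, struct, sys
fmt = "<" + "Q" * 8192  # 8192 little-endian unsigned 64-bit words = 65536 bytes
# Read raw bytes, not text. Seeking only works if stdin is a regular file
# (e.g. script.py <file), not a pipe.
stdin = sys.stdin.buffer
header_bytes = stdin.read(65536)
header_ints = list(struct.unpack(fmt, header_bytes))
stdin.seek(-65536, os.SEEK_END)
footer_bytes = stdin.read(65536)
footer_ints = list(struct.unpack(fmt, footer_bytes))
# your calculations here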
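The seek in the script above depends on standard input being a regular file. As a minimal sketch of the same idea that takes a path instead, the following may be easier to reuse. The function name `read_edge_words` and its `block` parameter are hypothetical, and it assumes `block` is a multiple of 8 and the file holds at least `block` bytes.

```python
import os
import struct


def read_edge_words(path, block=65536):
    """Return (head, tail): the first and last `block` bytes of the file,
    each decoded as little-endian unsigned 64-bit integers.

    Hypothetical helper: assumes `block` is a multiple of 8 and that the
    file is at least `block` bytes long.
    """
    fmt = "<%dQ" % (block // 8)  # e.g. "<8192Q" for 64 KiB
    with open(path, "rb") as f:
        head = f.read(block)
        f.seek(-block, os.SEEK_END)  # position `block` bytes before the end
        tail = f.read(block)
    return list(struct.unpack(fmt, head)), list(struct.unpack(fmt, tail))
```

Python integers are unbounded, so any hash arithmetic that should wrap at 64 bits needs an explicit `& 0xFFFFFFFFFFFFFFFF`.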

Here's my answer to the original question. The revised question doesn't have much to do with the original, which was about converting one 8-byte sequence into the 64-bit integer it represents in little-endian order.

I don't think bash has any built-in feature for this. The following snippet sets a to the hexadecimal representation of the number formed by the bytes of $string, read in big-endian order.

a=0x$(printf "%s" "$string" |
      od -t x1 -An |
      tr -dc '[:alnum:]')

For little-endian order, reverse the order of the bytes in the original string. In bash, and for a string of known length, you can do

a=0x$(printf "%s" "${string:7:1}${string:6:1}${string:5:1}${string:4:1}${string:3:1}${string:2:1}${string:1:1}${string:0:1}" |
      od -t x1 -An |
      tr -dc '[:alnum:]')
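To make the byte-reversal step concrete, here is a small Python sketch. It shows that reading the reversed bytes as big-endian gives the same number as reading the original bytes as little-endian. The sample bytes are made up for illustration.

```python
# Eight sample bytes, chosen arbitrarily for illustration.
raw = bytes([0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08])

# Big-endian reading: the first byte is the most significant.
big = int.from_bytes(raw, "big")

# Little-endian reading: the first byte is the least significant.
little = int.from_bytes(raw, "little")

# Reversing the bytes and reading big-endian matches the little-endian value.
reversed_big = int.from_bytes(raw[::-1], "big")

print(hex(big))           # 0x102030405060708
print(hex(little))        # 0x807060504030201
print(hex(reversed_big))  # 0x807060504030201
```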

You can also read the value in your platform's native endianness if your od supports 8-byte types.

a=0x$(printf "%s" "$string" |
      od -t x8 -An |
      tr -dc '[:alnum:]')
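As a cross-check on what native endianness means for a given machine, this Python sketch compares a native-order unpack with explicit little-endian and big-endian ones. The sample bytes are arbitrary; `sys.byteorder` reports the host's byte order.

```python
import struct
import sys

raw = bytes(range(1, 9))  # arbitrary sample: 01 02 03 04 05 06 07 08

native = struct.unpack("=Q", raw)[0]  # host byte order
little = struct.unpack("<Q", raw)[0]  # always little-endian
big = struct.unpack(">Q", raw)[0]     # always big-endian

# On a little-endian host the native reading equals the little-endian one;
# on a big-endian host it equals the big-endian one.
expected = little if sys.byteorder == "little" else big
print(sys.byteorder, hex(native), native == expected)
```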

Whether you can do arithmetic on $a will depend on whether your bash supports 8-byte arithmetic. Even if it does, it'll treat it as a signed value.
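The signedness issue can be illustrated in Python, whose integers are unbounded. This sketch assumes a 64-bit two's-complement shell integer; the helper names `as_signed64` and `as_unsigned64` are hypothetical.

```python
MASK64 = (1 << 64) - 1


def as_signed64(value):
    """Reinterpret an unsigned 64-bit value as two's-complement signed."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def as_unsigned64(value):
    """Map a signed 64-bit value back to its unsigned interpretation."""
    return value & MASK64


print(as_signed64(0xFFFFFFFFFFFFFFFF))  # -1
print(as_unsigned64(-1))                # 18446744073709551615
print(as_signed64(0x7FFFFFFFFFFFFFFF))  # 9223372036854775807
```

Any value with the top bit set comes out negative under signed 64-bit arithmetic, which matters for comparisons, division, and printing in decimal.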

Alternatively, use Perl:

a=0x$(perl -e 'printf "%x", unpack "Q<", $ARGV[0]' "$string")

If your perl is compiled without 64-bit integer support, you'll need to break the bytes up.

a=0x$(perl -e 'printf "%x%08x\n", reverse unpack "L<L<", $ARGV[0]' "$string")

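(Replace < by > for big-endian or remove it to get the platform endianness. In the two-word version, big-endian data already has the high word first, so drop the reverse as well.)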
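The two-halves approach can be sketched in Python as well, to show how the 32-bit words recombine. The sample bytes are arbitrary, and the final line mirrors the "%x%08x" format used in the Perl one-liner.

```python
import struct

raw = bytes([0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08])

# Two little-endian 32-bit words: the low half comes first in the data.
low, high = struct.unpack("<LL", raw)

# Shift the high half into place and merge in the low half.
combined = (high << 32) | low

# High half first, then the low half zero-padded to 8 hex digits.
print("%x%08x" % (high, low))  # 807060504030201
```

The zero padding on the low half matters: without it, a low word such as 0x00000001 would lose its leading zeros and the concatenated hex string would be wrong.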