Interpreting octal dump without options

od

$ echo "hello" | od
0000000 062550 066154 005157
0000006

I know that the first column represents the byte offset, but I don't see how the other numbers are formed. According to the man page, the output above should be "octal bytes". However, the option -b is also supposed to "select octal bytes", and it prints something different:

$ echo "hello" | od -b
0000000 150 145 154 154 157 012
0000006

EDIT: By the way, this is the output I would expect, i.e. the ASCII values of all the characters in 'hello\n', which is what I would call "octal bytes".

Best Answer

od doesn't show bytes by default; it shows 2-byte (16-bit) words in octal. This may not be intuitive, but don't forget od is a very old command :-) I'll use a somewhat simpler example than you did:

$ echo -en '\01\02' | od
0000000 001001
0000002

As Intel uses a little-endian architecture, the bytes \01\02 are read as one 16-bit word with the second byte in the high-order position: 00000010 00000001 in binary.

As octal digits each represent 3 bits, we can group that number like this:

(0)(000)(001)(000)(000)(001)

So the octal representation of those 2 bytes is:

001001
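The same arithmetic can be checked with a minimal Python sketch. It assumes the little-endian byte order described above (the "<H" format string hard-codes it) and is only an illustration of the computation, not of how od is implemented:

```python
import struct

data = b"\x01\x02"  # the two bytes from the example

# "<H" = one little-endian unsigned 16-bit word (assumed byte order)
(word,) = struct.unpack("<H", data)

print(format(word, "016b"))  # the word as 16 binary digits
print(format(word, "06o"))   # the same word as 6 octal digits
```

On a big-endian machine the word would be built the other way round, which corresponds to ">H" in this sketch.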

For day to day use this is pretty useless; perhaps back in the day it was handy for manually debugging memory dumps :-)

Your hello\n example is:

h = 01101000
e = 01100101
l = 01101100
l = 01101100
o = 01101111
\n= 00001010
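These per-byte bit patterns, and the byte-wise octal values that od -b printed in the question, can be reproduced with a short Python sketch (illustrative only; the variable names are made up for this example):

```python
data = b"hello\n"

# Each byte as 8 binary digits, as in the table above
bits = [format(b, "08b") for b in data]

# Each byte as 3 octal digits, which is what `od -b` prints
octal_bytes = [format(b, "03o") for b in data]

for ch, pattern in zip(data, bits):
    print(repr(chr(ch)), "=", pattern)
print(" ".join(octal_bytes))
```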

It's a bit more complicated now, because octal digits represent 3 bits, but a 2-byte word is 16 bits, which is not a multiple of 3; so each word is padded to 18 bits :-( The result, symbolically, is:

PehPllP\no

Remember, each set of 2 bytes is swapped due to the endianness. The P is a padding of 2 zero bits in front of each word. The result in binary is (using a slash as separator):

00/01100101/01101000/00/01101100/01101100/00/00001010/01101111

Now regrouped into groups of 3 bits:

000 110 010 101 101 000 000 110 110 001 101 100 000 000 101 001 101 111

Translated into octal digits:

062550 066154 005157

This matches your result.
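To tie the steps together, here is a rough Python emulation of what plain od printed for this input. It is a sketch under stated assumptions: little-endian 16-bit words, 16 input bytes per output line, and a zero byte appended to an odd-length tail. The function name od_default is hypothetical, and od's "*" abbreviation for repeated lines is not reproduced:

```python
import struct

def od_default(data: bytes) -> str:
    """Approximate plain `od` output for `data` (little-endian assumed)."""
    lines = []
    for offset in range(0, len(data), 16):
        chunk = data[offset:offset + 16]
        if len(chunk) % 2:
            chunk += b"\x00"  # assumption: odd trailing byte is zero-padded
        # Read the chunk as little-endian unsigned 16-bit words
        words = struct.unpack("<%dH" % (len(chunk) // 2), chunk)
        # 7-digit octal offset, then each word as 6 octal digits
        lines.append(" ".join(["%07o" % offset] + ["%06o" % w for w in words]))
    lines.append("%07o" % len(data))  # final line: total byte count
    return "\n".join(lines)

print(od_default(b"hello\n"))
```

Calling od_default(b"\x01\x02") gives the word 001001 from the simpler example in the same way.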

In conclusion you've probably learnt that od without options is worse than useless :-)
