Bash – ASCII to Binary and Binary to ASCII conversion tools

ascii, bash, binary

What is a good tool for converting ASCII to binary, and binary to ASCII?

I was hoping for something like:

$ echo --binary "This is a binary message"
01010100 01101000 01101001 01110011 00100000 01101001 01110011 00100000 01100001 00100000 01100010 01101001 01101110 01100001 01110010 01111001 00100000 01101101 01100101 01110011 01110011 01100001 01100111 01100101

Or, more realistic:

$ echo "This is a binary message" | ascii2bin
01010100 01101000 01101001 01110011 00100000 01101001 01110011 00100000 01100001 00100000 01100010 01101001 01101110 01100001 01110010 01111001 00100000 01101101 01100101 01110011 01110011 01100001 01100111 01100101

And also the reverse:

$ echo "01010100 01101000 01101001 01110011 00100000 01101001 01110011 00100000 01100001 00100000 01100010 01101001 01101110 01100001 01110010 01111001 00100000 01101101 01100101 01110011 01110011 01100001 01100111 01100101" | bin2ascii
This is a binary message

PS: I'm using bash

PS2: I hope I didn't get the wrong binary

Best Answer

$ echo AB | perl -lpe '$_=unpack"B*"'
0100000101000010
$ echo 0100000101000010 | perl -lpe '$_=pack"B*",$_'
AB

-e expression: evaluate the given expression as perl code.
-p: sed mode. The expression is evaluated for each line of input, with the content of the line stored in the $_ variable and printed after the evaluation of the expression.
-l: even more like sed: instead of the full line, only the content of the line (that is, without the line delimiter) is in $_ (and a newline is added back on output). So perl -lpe code works like sed code except that it's perl code as opposed to sed code.
unpack "B*" works on the $_ variable by default and extracts its content as a bit string walking from the highest bit of the first byte to the lowest bit of the last byte.
pack does the reverse of unpack. See perldoc -f pack for details.

With spaces:

$ echo AB | perl -lpe '$_=join " ", unpack"(B8)*"'
01000001 01000010
$ echo 01000001 01000010 | perl -lape '$_=pack"(B8)*",@F'
AB

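(This assumes the input comes in blocks of 8 bits, zero-padded.)

With unpack "(B8)*", we extract 8 bits at a time, and we join the resulting strings with spaces using join " ". In the reverse direction, -a splits each input line on whitespace into the @F array, and pack "(B8)*" turns each 8-bit group back into one byte.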

Related Solutions

How to use Bash to find 2 bytes in a binary file, increase their values, and replace

Testing with this file:

$ echo hello world > test.txt
$ echo -n $'\x1b\x1f' >> test.txt
$ echo whatever >> test.txt
$ hexdump -C test.txt 
00000000  68 65 6c 6c 6f 20 77 6f  72 6c 64 0a 1b 1f 77 68  |hello world...wh|
00000010  61 74 65 76 65 72 0a                              |atever.|
$ grep -a -b --only-matching $'\x1b\x1f' test.txt 
12:

So in this case the 1B 1F is at position 12.
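
For scripting, the offset can be captured in a variable instead of being read off the screen. A rough sketch: first_offset is a made-up name, and the -F flag (fixed-string matching) and the LC_ALL=C setting are additions that the answer does not use, included here so the bytes are matched literally.

```shell
# first_offset PATTERN FILE
# Print the byte offset of the first occurrence of PATTERN in FILE.
# Hypothetical helper; PATTERN must not contain NUL or newline bytes.
first_offset() {
  LC_ALL=C grep -a -b -o -F -- "$1" "$2" | head -n 1 | cut -d: -f1
}

# Usage (bash):
#   pos=$(first_offset $'\x1b\x1f' test.txt)
```

Nothing is printed if the pattern does not occur, so an empty result should be treated as "not found".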

Convert to integer (there is probably an easier way):

$ echo "ibase=16; $(xxd -u -ps -l 2 -s 12 test.txt)" | bc
6943
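
One possibly easier way is to let printf do the hexadecimal conversion, which avoids bc. The sketch below swaps xxd for od so that it relies only on POSIX tools; read_u16_be is a hypothetical name, and it assumes the two bytes form a big-endian value, as the conversion above does.

```shell
# read_u16_be FILE OFFSET
# Print the two bytes at OFFSET in FILE as a big-endian unsigned integer.
# Hypothetical helper; uses od instead of the answer's xxd.
read_u16_be() {
  local hex
  # od prints e.g. " 1b 1f"; strip spaces and newlines to get "1b1f"
  hex=$(od -An -tx1 -j "$2" -N 2 "$1" | tr -d ' \n')
  # printf accepts C-style hex constants for %d
  printf '%d\n' "0x$hex"
}

# Usage:
#   read_u16_be test.txt 12
```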

And the reverse:

$ printf '%04X' 6943 | xxd -r -ps | hexdump -C
00000000  1b 1f                                             |..|
$ printf '%04X' 4242 | xxd -r -ps | hexdump -C
00000000  10 92                                             |..|

And putting it back in the file:

$ printf '%04X' 4242 | xxd -r -ps | dd of=test.txt bs=1 count=2 seek=12 conv=notrunc
2+0 records in
2+0 records out
2 bytes (2 B) copied, 5.0241e-05 s, 39.8 kB/s

Result:

$ hexdump -C test.txt
00000000  68 65 6c 6c 6f 20 77 6f  72 6c 64 0a 10 92 77 68  |hello world...wh|
00000010  61 74 65 76 65 72 0a                              |atever.|
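
The answer writes a fixed value (4242) back into the file. To actually increase the existing value, the read, add, and write steps can be combined. This is a sketch under several assumptions: add_u16_be is a made-up name, the value is treated as big-endian and wraps around at 16 bits, and od plus perl's pack "n" stand in for the xxd and bc calls used above.

```shell
# add_u16_be FILE OFFSET DELTA
# Add DELTA to the big-endian 16-bit value stored at OFFSET in FILE, in place.
# Hypothetical helper; not the answer's exact commands.
add_u16_be() {
  local file="$1" offset="$2" delta="$3" hex new
  # Read the two bytes as four hex digits, e.g. "1b1f"
  hex=$(od -An -tx1 -j "$offset" -N 2 "$file" | tr -d ' \n')
  # Refuse to continue if fewer than two bytes were available
  [ "${#hex}" -eq 4 ] || return 1
  # Add and wrap around at 16 bits (assumption: overflow should wrap)
  new=$(( (0x$hex + delta) & 0xFFFF ))
  # pack "n" emits a 16-bit big-endian value; dd overwrites without truncating
  perl -e 'print pack("n", $ARGV[0])' "$new" |
    dd of="$file" bs=1 seek="$offset" conv=notrunc 2>/dev/null
}

# Usage:
#   add_u16_be test.txt 12 1
```

As in the dd command above, conv=notrunc is what keeps the rest of the file intact.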

A shell-like environment for binary processing

I have had exactly the same problem for years as well.

For simple non-interactive uses, I like the binary block editor BBE. BBE is to binary data what sed is to text, including its archaic syntax and simplicity. However, it lacks many features that I often need, so I have to combine it with other tools, which makes BBE only a partial solution. Also note that BBE hasn't had any updates or improvements for years.

Of course one can use xxd before and xxd -r after editing the data with text-based tools, but that won't work when the data in question is large and random access is required, for example when processing block devices.

(Note: For Windows, there is at least the costly, proprietary WinHex scripting language, but that won't get us anywhere.)

For more complicated binary editing, I usually fall back to Python as well, even though it is sometimes too slow for large files, which is its main drawback. I hope Pyston (Python employing LLVM to compile to optimized machine code) will someday mature enough to be usable or, even better, that someone will design and implement a free, compact, fast, and versatile scripting language for binary processing, which AFAIK doesn't exist yet for U*IX-like systems.

UPDATE

I also happen to use flat assembler (fasm for short), a homebrew, open-source Intel x86 assembler that has evolved into much more than just an assembler.

It has a powerful, text-block-based macro preprocessor (itself a Turing-complete language) with a syntax in the tradition of the Borland Turbo Assembler macro language, but much more advanced.

It also has a data manipulation language that lets you include arbitrary files as binary data, perform all kinds of binary and arithmetic manipulation on them (integer only) at "compile time", and write the result to an output file. This data manipulation language has control structures and is also Turing-complete.

It is much easier to use than writing a program that does some binary manipulation in C, and probably even in Python. Plus, it loads blindingly fast, as it is a small executable with almost no external dependencies. (There are two versions: one requires only libc, and the other runs as a static executable directly on the Linux kernel ABI.)

It does have some rough edges, such as:

It does not support concurrency.
It is written in 32-bit x86 assembly (it works on x86_64 though), so you probably need qemu or a similar emulator to run it on anything other than x86 or x86_64.
Its powerful macro preprocessor language is Turing-complete, which means you had better have some experience with languages like Lisp, Haskell, XSLT, or, probably the best preparation, M4.
All data to be written to the output file is assembled in a "flat" buffer in memory, and this buffer can grow but not shrink until the output file has been written and fasm has terminated. This means that a single run of fasm can only generate files at most as large as the main memory you have left.
Data can only be written to a single output file per run of fasm.
Yes, it is homebrew, though a really neat and clever one.
