I have had the exact same problem as you for years.
For simple non-interactive uses, I like to use the binary block editor BBE.
BBE is to binary as sed is to text, including its archaic syntax and simplicity. However, it lacks many features that I often need, so I have to combine it with other tools. BBE is therefore only a partial solution.
Also note that BBE hasn't had any updates or improvements for years.
Of course, one can use xxd before and xxd -r after editing the data with text-based tools, but that won't work when the data in question is large and random access is required, for example when processing block devices.
(Note: For Windows, there is at least the costly, proprietary WinHex scripting language, but that won't get us anywhere.)
For more complicated binary editing, I usually fall back to Python as well, even though it is sometimes too slow for large files, which is its main drawback. I hope Pyston (Python employing LLVM to compile to optimized machine code) will someday mature enough to be usable, or, even better, that someone will design and implement a free, compact, fast and versatile binary processing scripting language, which AFAIK doesn't exist for U*IX-like systems yet.
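To illustrate the kind of random-access editing that the xxd round trip cannot handle well, here is a minimal Python sketch. The function names patch_at and replace_in_place are my own, not part of any library, and the sketch assumes a regular, non-empty file and same-length replacements (anything else would require shifting the rest of the data).

```python
import mmap


def patch_at(path, offset, data):
    """Overwrite len(data) bytes at the given offset without touching the rest."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)


def replace_in_place(path, old, new):
    """Replace every occurrence of `old` with the same-length `new`.

    Returns the number of replacements. The file is memory-mapped, so it is
    not read into memory as a whole. Note: mmap refuses to map an empty file.
    """
    if not old or len(old) != len(new):
        raise ValueError("patterns must be non-empty and of equal length")
    count = 0
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as m:
        pos = m.find(old)
        while pos != -1:
            m[pos:pos + len(new)] = new      # same-length slice assignment
            count += 1
            pos = m.find(old, pos + len(old))
        m.flush()
    return count
```

For example, replace_in_place("image.bin", b"\xde\xad", b"\xbe\xef") would patch every occurrence in a (hypothetical) image.bin and return how many were changed.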
UPDATE
I also happen to use flat assembler (fasm for short), a homebrew, open-source Intel x86 assembler that has evolved into much more than just an assembler.
It has a powerful, text-block-based macro preprocessor (itself a Turing-complete language) with a syntax in the tradition of the Borland Turbo Assembler macro language, but much more advanced.
It also has a data manipulation language, which lets you include arbitrary files as binary data, perform all kinds of binary and arithmetic manipulation on them (integer only) at "compile time", and write the result into an output file. This data manipulation language has control structures and is also Turing-complete.
It is much easier to use than writing a program that does some binary manipulation in C, and probably even in Python. Plus, it loads blindingly fast, as it is a small executable with almost no external dependencies. (There are two versions: one requires only libc, the other runs as a static executable directly on the Linux kernel ABI.)
It does have some rough edges, such as:
- not supporting concurrency
- being written in 32-bit x86 assembly (it works on x86_64, though); you will probably need qemu or a similar emulator if you want to run it on anything other than x86 or x86_64
- its powerful macro preprocessor language is Turing-complete, which means you had better have some experience with languages like Lisp, Haskell, XSLT, or, probably the best choice, M4
- all data to be written into the output file is assembled in a "flat" buffer in memory, and this buffer can grow but not shrink until the output file has been written and fasm has terminated. This means that a single run of fasm can only generate files at most as large as the main memory you have left.
- data can only be written into a single output file for each run of fasm
- yeah, it is homebrew, though a really neat and clever one
If your timestamps are consistently formatted, you could strip them off (with sed, for example) before processing the files with whatever differencing method you prefer, e.g.
diff <(sed -E 's|[0-9]{2}:[0-9]{2}:[0-9]{2} [0-9]{2}/[0-9]{2}/[0-9]{2,4} [0-9]{1,} ||' fileA) <(sed -E 's|[0-9]{2}:[0-9]{2}:[0-9]{2} [0-9]{2}/[0-9]{2}/[0-9]{2,4} [0-9]{1,} ||' fileB)
Testing on your supplied input files:
$ diff \
<(sed -E 's|[0-9]{2}:[0-9]{2}:[0-9]{2} [0-9]{2}/[0-9]{2}/[0-9]{2,4} [0-9]{1,} ||' fileA) \
<(sed -E 's|[0-9]{2}:[0-9]{2}:[0-9]{2} [0-9]{2}/[0-9]{2}/[0-9]{2,4} [0-9]{1,} ||' fileB)
2,3c2,3
< abc xxx
< ghi eee ddd
---
> abc def
> ghi fff ddd
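If you would rather do the same thing from a script, a rough Python equivalent could look like the sketch below. It reuses the regular expression from the sed command and removes only the first match per line, as sed does without the g flag. The function names are my own, and it produces unified-diff output via difflib rather than the classic diff format shown above.

```python
import difflib
import re

# Same pattern as in the sed command: time, date, a number, trailing space.
TIMESTAMP = re.compile(
    r"[0-9]{2}:[0-9]{2}:[0-9]{2} [0-9]{2}/[0-9]{2}/[0-9]{2,4} [0-9]+ "
)


def strip_timestamps(lines):
    """Remove the first timestamp prefix found on each line."""
    return [TIMESTAMP.sub("", line, count=1) for line in lines]


def diff_without_timestamps(lines_a, lines_b):
    """Return unified-diff lines for the two inputs, ignoring timestamps."""
    return list(
        difflib.unified_diff(
            strip_timestamps(lines_a),
            strip_timestamps(lines_b),
            "fileA",
            "fileB",
            lineterm="",
        )
    )
```

Pass it the lines of each file with the newlines already removed, for example open("fileA").read().splitlines(). An empty result means the files differ only in their timestamps.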
Best Answer
This can easily be done with diff. For example:
In the example above, the foo/ and bar/ directories contain binary files and bash2 is only in foo/.
So, you could run something simple like:
That will show you the different files, if any, or print "The directories' contents are identical" if they are. To compare subdirectories and any files they may contain as well, use diff -r. Combine it with -q to suppress the output for text files.
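If you need the same check inside a script, Python's standard filecmp module can do a comparable recursive comparison. This is only a sketch: compare_dirs is a name I made up, and it re-checks common files byte by byte because dircmp on its own compares only os.stat() signatures. Entries that exist on both sides but with different types (a file versus a directory, say) are not reported here.

```python
import filecmp
import os


def compare_dirs(left, right):
    """Return a sorted list of (kind, relative_path) differences."""
    differences = []

    def walk(cmp, prefix=""):
        for name in cmp.left_only:
            differences.append(("only in left", os.path.join(prefix, name)))
        for name in cmp.right_only:
            differences.append(("only in right", os.path.join(prefix, name)))
        # shallow=False forces a content comparison, which is what we want
        # for binary files whose size and mtime might happen to match.
        _, mismatch, errors = filecmp.cmpfiles(
            cmp.left, cmp.right, cmp.common_files, shallow=False
        )
        for name in mismatch + errors:
            differences.append(("differ", os.path.join(prefix, name)))
        # Recurse into directories present on both sides.
        for name, sub in cmp.subdirs.items():
            walk(sub, os.path.join(prefix, name))

    walk(filecmp.dircmp(left, right))
    return sorted(differences)
```

An empty list from compare_dirs("foo", "bar") would mean the two trees have identical contents.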