dd – Best Way to Remove Bytes from Start of a File

Today I had to remove the first 1131 bytes from an 800MB mixed text / binary file, a filtered subversion dump I'm hacking for a new repository. What's the best way to do this?

To start with I tried

dd bs=1 skip=1131 if=filtered.dump of=trimmed.dump

but after the skip this copies the rest of the file one byte at a time, which is very slow. In the end I worked out that I needed 405 bytes of padding to round the offset up to three 512-byte blocks (1131 + 405 = 1536 = 3 × 512), which I could then skip

dd if=/dev/zero of=405zeros bs=1 count=405
cat 405zeros filtered.dump | dd bs=512 skip=3 of=trimmed.dump

This completed fairly quickly, but surely there is a simpler or better way? Is there another tool I've forgotten about?
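The padding figure used above can be checked with shell arithmetic. This is only a minimal sketch: the variable names (offset, block, blocks, pad) are made up for illustration, and only the numbers 1131 and 512 come from the question.

```shell
# Hypothetical variable names; the values are taken from the question.
offset=1131      # bytes to drop from the start of the file
block=512        # dd block size used for the skip

# Whole blocks needed to cover the offset, rounding up.
blocks=$(( (offset + block - 1) / block ))

# Padding that makes offset + pad an exact multiple of the block size.
pad=$(( blocks * block - offset ))

echo "pad=$pad blocks=$blocks"
```

The same two lines of arithmetic would give the padding for any other offset or block size.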

Best Answer

You can swap the values of the bs and skip options:

dd bs=1131 skip=1 if=filtered.dump of=trimmed.dump

This way dd skips a single 1131-byte block and then copies 1131 bytes per read and write instead of one, so the copy needs far fewer system calls.

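Otherwise, you could try tail. With -c it counts bytes rather than lines, and POSIX allows arbitrary data as input in that mode, so it should cope with binary files, although it is worth verifying the result on an unfamiliar implementation. Note that +1132 means "start output at byte 1132", which drops the first 1131 bytes: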

tail -c +1132 filtered.dump >trimmed.dump
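Before running either command on an 800MB dump, it may be worth trying both on a small throwaway file. The sketch below assumes mktemp, cmp and a printf that understands octal escapes are available. The file names and the 6-byte header are invented for the test and are not from the question.

```shell
# Scratch directory so the sketch does not touch real data.
tmp=$(mktemp -d)

# A 6-byte header followed by a payload containing NUL and 0xFF bytes.
printf 'HEADER' > "$tmp/sample.bin"
printf 'pay\000load\377\n' > "$tmp/expected.bin"
cat "$tmp/expected.bin" >> "$tmp/sample.bin"

n=6   # number of leading bytes to drop

# tail -c +K starts output AT byte K (1-based), so K must be n + 1.
tail -c +"$((n + 1))" "$tmp/sample.bin" > "$tmp/by_tail.bin"

# dd with bs=n skip=1 skips exactly one n-byte block.
dd bs="$n" skip=1 if="$tmp/sample.bin" of="$tmp/by_dd.bin" 2>/dev/null

# Compare each result byte-for-byte with the expected payload.
tail_ok=no; cmp -s "$tmp/by_tail.bin" "$tmp/expected.bin" && tail_ok=yes
dd_ok=no;   cmp -s "$tmp/by_dd.bin"   "$tmp/expected.bin" && dd_ok=yes

rm -rf "$tmp"
echo "tail: $tail_ok, dd: $dd_ok"
```

If either check reports "no" on a given system, that command should not be trusted with the real dump there.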

Finally, you can chain three dd instances like this:

dd if=filtered.dump bs=512k | { dd bs=1131 count=1 of=/dev/null; dd bs=512k of=trimmed.dump; }

where the first dd writes filtered.dump to its standard output; the second reads just the first 1131 bytes from the pipe and throws them away; and the last reads the remaining bytes of filtered.dump from its standard input and writes them to trimmed.dump.
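The three-stage pipeline can be tried on a small generated file first. This is a sketch under assumptions: the scratch directory and the 'payload' marker are invented for the test, and the input is small enough that the second dd receives its 1131 bytes in a single read. On a pipe, dd can in principle get a short read, so the output size is worth checking on real data.

```shell
tmp=$(mktemp -d)
n=1131   # bytes to drop, as in the question

# Sample input: n zero bytes followed by a recognisable payload.
dd if=/dev/zero of="$tmp/filtered.dump" bs="$n" count=1 2>/dev/null
printf 'payload\n' >> "$tmp/filtered.dump"

# Stage 1 streams the file; stage 2 discards n bytes; stage 3 copies the rest.
dd if="$tmp/filtered.dump" bs=512k 2>/dev/null | {
    dd bs="$n" count=1 of=/dev/null 2>/dev/null
    dd bs=512k of="$tmp/trimmed.dump" 2>/dev/null
}

result=$(cat "$tmp/trimmed.dump")
rm -rf "$tmp"
echo "$result"
```

The two dd commands inside the braces share one standard input, which is why the third picks up exactly where the second stopped.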
