How to Use `dd` to Right-Shift Data Blocks

block-device dd

Consider a 100MB raw block device as a simple example. That is 204800 blocks of 512 bytes each, for a total of 104857600 bytes.
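
For reference, the arithmetic behind those figures:

$ echo $((204800 * 512))   # the whole device
104857600
$ echo $((200704 * 512))   # the first 98MB
102760448
$ echo $((4096 * 512))     # the 2MB gap
2097152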

The challenge is to shift the first 98MB (200704 blocks) so there is a gap of 2MB (4096 blocks) in front of it. Doing this in place requires that nothing is written to a sector that has not yet been read. One way to achieve this is to introduce a buffer:

$ dd if=/dev/sdj2 count=200704 | mbuffer -s 512 -b 4096 -P 100 | dd of=/dev/sdj2 seek=4096

The expectation is that mbuffer will store 4096 blocks before passing anything to the writer, thus ensuring that nothing is written to an area that has not been read and that the writer lags the reader by the size of the buffer. The buffer should allow the reader and writer to operate as fast as possible within those constraints.
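
For clarity, here is the same pipeline with each stage annotated (the option meanings are as documented in the mbuffer man page):

$ dd if=/dev/sdj2 count=200704 |    # read the first 200704 sectors (98MB)
  mbuffer -s 512 -b 4096 -P 100 |   # 4096 blocks of 512 bytes = 2MB; write only once 100% full
  dd of=/dev/sdj2 seek=4096         # write back to the same device, 4096 sectors further on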

However, it doesn't seem to work reliably. On real devices it never works; experiments with a file worked on my 64-bit box but not on my 32-bit box.

First, some preparation:

$ dd if=/dev/sdj2 count=200704 | md5sum
0f0727f6644dac7a6ec60ea98ffc6da9
$ dd if=/dev/sdj2 count=200704 of=testfile
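
After a successful shift, the result would be verified by reading the data back from its new offset and comparing the hash with the one above:

$ dd if=/dev/sdj2 skip=4096 count=200704 | md5sum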

This doesn't work:

$ dd if=/dev/sdj2 count=200704 | mbuffer -s 512 -b 4096 -P 100 -H | dd of=/dev/sdj2 seek=4096
summary: 98.0 MiByte in  4.4sec - average of 22.0 MiB/s
md5 hash: 3cbf1ca59a250d19573285458e320ade

This works on the 64-bit system but not on the 32-bit system:

$ dd if=testfile count=200704 | mbuffer -s 512 -b 4096 -P 100 -H | dd of=testfile seek=4096 conv=notrunc
summary: 98.0 MiByte in  0.9sec - average of  111 MiB/s
md5 hash: 0f0727f6644dac7a6ec60ea98ffc6da9

How can this be done reliably?


Notes

I have read other questions about buffering and looked at pv, buffer and mbuffer. Of those, only mbuffer could be made to work with the required buffer size.

Using intermediate storage is an obvious solution that always works, but it isn't practical when sufficient spare capacity isn't available.

The test platforms run Arch Linux with mbuffer version 20140302.

Best Answer

Without a buffer, you could go backwards, one block at a time.

# copy the 98 one-MB blocks of data (indices 0 through 97) backwards,
# each to a position 2MB further along
for i in $(seq 97 -1 0)
do
    dd if=/dev/thing of=/dev/thing \
       bs=1M skip=$i seek=$(($i+2)) count=1
done

Please note that this example is dangerous due to lack of error checking.

It's also slow because of the number of dd invocations. If you have memory to spare, you could use a larger block size.
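
For example, here is a sketch of the same backwards copy using 2M blocks, so the 2M shift is exactly one block, with minimal error checking added. It assumes the layout from the question (98MB of data, i.e. 49 blocks of 2M), and /dev/thing is still a placeholder device:

for i in $(seq 48 -1 0)
do
    # copy the 2M block at offset $i*2M to offset ($i+1)*2M; stop on any failure
    dd if=/dev/thing of=/dev/thing \
       bs=2M skip=$i seek=$(($i+1)) count=1 || exit 1
done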

With a buffer, beware of pitfalls. It is not sufficient to guarantee a 100% prefill. What you need is a minimum fill throughout the entire process. The buffer must never drop below 2M, because otherwise you will overwrite data that has not yet been read.

So while in theory you could do without a dedicated buffer program and just chain dd:

dd if=/dev/thing bs=1M | \
dd bs=1M iflag=fullblock | \
dd bs=1M iflag=fullblock | \
dd of=/dev/thing bs=1M seek=2

In practice this does not work reliably, because there is no guarantee that the first dd manages to stay ahead: the last dd may already be writing while less than the full 2M of "buffer" sits in between.

You can improve your chances considerably by making the in-between buffer much larger, but even so, it's not reliable.
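
For instance, one way to enlarge the in-between buffer is to put an mbuffer stage with a large memory buffer in the middle (a sketch only; it still has no minimum-fill guarantee, so it remains unreliable):

dd if=/dev/thing bs=1M | mbuffer -m 64M -P 100 | dd of=/dev/thing bs=1M seek=2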

Unfortunately, I do not know of a good buffer program with a minimum-fill property. You need one that stops producing output whenever the buffer holds less than your safety margin.
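
As an untested sketch of what such a tool could look like, here is a small shell relay that reads the stream in 2M chunks and only emits a chunk once the next one has been read in full, so the writer always lags the reader by at least the 2M safety margin. It spools through two small temporary files (a few MB in total, not the whole stream), which I am assuming counts as acceptable scratch space; the script name and behaviour are hypothetical, not an existing tool:

#!/bin/sh
# minfill-relay.sh (hypothetical): always hold back one full chunk.
chunk=$((2 * 1024 * 1024))          # safety margin = chunk size = 2M
a=$(mktemp) ; b=$(mktemp)
trap 'rm -f "$a" "$b"' EXIT

# prime with the first chunk before emitting anything
dd of="$a" bs="$chunk" count=1 iflag=fullblock 2>/dev/null

while dd of="$b" bs="$chunk" count=1 iflag=fullblock 2>/dev/null && [ -s "$b" ]
do
    cat "$a"                        # the previous chunk is now safe to emit
    t=$a ; a=$b ; b=$t              # swap the two spool files
done
cat "$a"                            # flush the final (possibly short) chunk

It would then slot into the original pipeline in place of mbuffer:

$ dd if=/dev/sdj2 count=200704 | sh minfill-relay.sh | dd of=/dev/sdj2 seek=4096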
