How large is the img file that dd creates?

dd, hard-disk

This is the first time I've used the dd command.
I execute:

dd if=/dev/sdb2 of=/mnt/sdc1/Hdd1.img bs=512 conv=noerror,sync

where sdb is the destroyed HDD (size: 500 GB).
I'm copying the partition sdb2 into an image. It has been running for 6(!!) days, the img file is already about 640 GB and still growing (i.e. it hasn't finished yet…). For 6 days it has been printing details of the copied data (which byte was copied where) and it isn't stopping.

Is this normal? How can the img be larger than the entire destroyed HDD? And when is it supposed to finish?

Best Answer

By doing the copy 512 bytes at a time you are doing lots and lots of reads and writes. About a billion, actually, if you do the math. You've also asked for sync [EDIT: this is not oflag=sync, so the next statement is invalid], which means waiting for each write to actually make it out to disk before that write can return. Let's say your disk is pretty speedy, so each write takes 2 ms.

500 GB / 512 bytes × 2 ms ≈ 22.6 days.

Wow, billions of milliseconds added up fast, didn't they?
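
If you want to double-check that figure, a quick back-of-envelope calculation (assuming exactly 500 GB and a flat 2 ms per write) is:

# echo '500 * 10^9 / 512 * 0.002 / 86400' | bc -l

which comes out to roughly 22.6 days.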

[EDIT: while that was certainly a fun bit of math, it's not accurate since oflag=sync wasn't used. The delays are more likely due to repeatedly reading bad sectors and those associated timeouts. The below dd_rescue approach should help quite a bit. Using plain dd with a larger block size might help, but not as much since it can't adapt its read size and won't skip over massive damage.]

If you use a larger block size and/or skip the sync, it will run MUCH faster:

# dd if=/dev/sdb2 of=/sdb2-image.img bs=1024k
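
If you'd rather keep the noerror,sync padding from your original command but with less per-block overhead, a middle-ground invocation (a sketch along the same lines, using the same paths as above) would be:

# dd if=/dev/sdb2 of=/sdb2-image.img bs=64k conv=noerror,sync

Bear in mind that sync pads each input block to the full block size, so with 64k blocks a single unreadable sector can blank out an entire 64 KiB in the image; the dd_rescue approach below avoids that by shrinking its read size around errors.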

If you're concerned about read errors while reading sdb2, use dd_rescue with the -A option, which writes out a block of zeroes instead of skipping the write entirely. Skipping blocks with errors can lead to problems when certain filesystem structures end up at different offsets from the start than they originally were. It's better to just have some unexpected zeroes. For example:

# dd_rescue -A /dev/sdb2 /sdb2-image.img

This will start out reading large blocks of data at once and only reduce the block size when it starts hitting errors.

EDIT: to directly answer the question, as suggested by Micheal Johnson: when using conv=noerror,sync on dd or -A on dd_rescue, your image will end up exactly the same size as your source, because every read generates an identically sized write. Some versions of dd may keep running long past the end of the device, since they ignore the end-of-file "error" per your conv=noerror request. I don't think Linux does this, but it's something to watch out for if your image seems to be getting larger than the source.
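
Once a copy finishes, you can verify this yourself by comparing the partition size with the image size (using the same paths as the examples above):

# blockdev --getsize64 /dev/sdb2
# stat -c %s /sdb2-image.img

The two numbers should match exactly; if the image is larger, you've hit the runaway behaviour described above.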
