Linux – Why specify block size when copying devices of a finite size

cloning, dd, hard-disk, linux

Online tutorials often suggest using the following command to copy a CD-ROM to an ISO image:

$ dd if=/dev/dvd of=foobar.iso bs=2048

Why must the block size be specified? I notice that 2048 bytes is in fact the standard sector size for CD-ROM images, but dd seems to work just as well without bs= or count=.

Under what circumstances would it be problematic to not specify bs= or count= when copying from a device of finite size?

Best Answer

The question "When is dd suitable for copying data? (or, when are read() and write() partial)" points out an important caveat when using count: dd can copy partial blocks, so when given count it stops after that number of blocks, even if some of them were incomplete. You may therefore end up with fewer than bs * count bytes copied unless you specify iflag=fullblock.
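The partial-block caveat can be reproduced without any hardware. The following is a minimal sketch, assuming GNU dd (iflag=fullblock is a GNU extension) and a pipe whose writer delivers its data in two bursts. The slow_writer helper, the 1000-byte block size and the one-second pause are arbitrary choices made for illustration.

```shell
# Hypothetical helper: delivers 8 bytes in two bursts, one second apart.
slow_writer() { printf 'aaaa'; sleep 1; printf 'bbbb'; }

# Without fullblock, the first read() returns only the bytes available so far,
# and that partial block still counts toward count=1.
short=$(slow_writer | dd bs=1000 count=1 2>/dev/null | wc -c)

# With fullblock, dd keeps reading until the block is full or the input ends.
full=$(slow_writer | dd bs=1000 count=1 iflag=fullblock 2>/dev/null | wc -c)

echo "without fullblock: $short bytes, with fullblock: $full bytes"
```

Because the result depends on timing, the first figure is not guaranteed, but on a typical system the copy without fullblock comes up short while the copy with it receives all of the input.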

The default block size for dd is 512 bytes. count is a limit; as your question hints, it isn't required when copying a device of finite size, and is really intended for copying only part of a device.
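As a small illustration of count acting as a limit, the sketch below copies only the first 16 blocks of a source. It assumes a regular file standing in for the device (created in a temporary directory), so the file names and sizes are made up for the example.

```shell
workdir=$(mktemp -d)

# Stand-in for a device: 1 MiB of zero bytes (512 blocks of 2048 bytes).
dd if=/dev/zero of="$workdir/source.img" bs=2048 count=512 2>/dev/null

# Copy only the first 16 blocks; count stops dd before the end of the input.
dd if="$workdir/source.img" of="$workdir/head.img" bs=2048 count=16 2>/dev/null

head_size=$(wc -c < "$workdir/head.img")
echo "copied $head_size bytes"   # 16 * 2048 = 32768 when every read is a full block

rm -r "$workdir"
```

Leaving count out of the second command would copy the whole file, since dd stops on its own at end of input.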

I think there are two aspects to consider here: performance and data recovery.

As far as performance is concerned, you ideally want the block size to be at least equal to, and a multiple of, the underlying physical block size (hence 2048 bytes when reading a CD-ROM). In fact nowadays you may as well specify larger block sizes to give the underlying caching systems a chance to buffer things for you. But increasing the block size means dd has to use that much more memory, and it could be counter-productive if you're copying over a network because of packet fragmentation.
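To see that a larger block size changes only how the copy is performed and not what is copied, here is a rough sketch that uses a regular file as a stand-in for the disc. The file names are hypothetical, and 1048576 bytes (1 MiB) is just one possible larger size that is still a multiple of 2048.

```shell
workdir=$(mktemp -d)

# Stand-in for the disc: 4 MiB of pseudo-random data.
dd if=/dev/urandom of="$workdir/source.img" bs=2048 count=2048 2>/dev/null

# The same copy at the native sector size and at a 1 MiB block size
# (1048576 = 512 * 2048, so it is still a multiple of the sector size).
dd if="$workdir/source.img" of="$workdir/small.iso" bs=2048 2>/dev/null
dd if="$workdir/source.img" of="$workdir/large.iso" bs=1048576 2>/dev/null

# The block size changes how many read()/write() calls dd makes, not the result.
if cmp -s "$workdir/small.iso" "$workdir/large.iso"; then same=yes; else same=no; fi
echo "identical output: $same"

rm -r "$workdir"
```

Prefixing the two copy commands with time on a real device is a simple way to check whether the larger block size actually helps on a given system.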

As far as data recovery is concerned, you may retrieve more data from a failing hard disk if you use smaller block sizes. This is what programs such as ddrescue do automatically: they read large blocks initially, but if a block fails they re-read it with smaller block sizes. dd won't do this; it will just fail the whole block.
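If only plain dd is available, a common manual approximation is a small block size combined with conv=noerror,sync. This is a sketch under assumptions, not the adaptive retry that dedicated rescue tools perform: it runs against a healthy regular file (the file names are hypothetical), so no read errors actually occur here.

```shell
workdir=$(mktemp -d)

# Stand-in for the failing disk: a healthy 64 KiB file, so nothing is lost in this run.
dd if=/dev/urandom of="$workdir/disk.img" bs=512 count=128 2>/dev/null

# noerror: keep going after a read error instead of aborting.
# sync:    pad each short or failed input block with zero bytes to the full
#          block size, so later data keeps its offset in the output.
# bs=512:  keeps the amount lost per unreadable block small.
dd if="$workdir/disk.img" of="$workdir/rescued.img" bs=512 conv=noerror,sync 2>/dev/null

if cmp -s "$workdir/disk.img" "$workdir/rescued.img"; then intact=yes; else intact=no; fi
echo "copy matches source: $intact"

rm -r "$workdir"
```

On a real failing disk each unreadable block would come out as 512 zero bytes, which is the trade-off the paragraph above describes: smaller blocks lose less data per error but make the copy slower.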
