Cloning a single disk drive to multiple drives simultaneously

hard drive

I am looking for a way to clone a single disk drive to more than one disk drive at the same time.

I have prepared system images on 1TB disks. It takes almost 2 hours to clone one disk to another, and the total time goes up exponentially when I need, say, 30 disks cloned.

If it were possible to clone one disk to more than a single target, it would simplify the whole procedure a lot.

Also, is there something that prevents this kind of operation? I mean, is there some special reason why every disk cloning program that I know of supports only a single target drive?

Thanks!

Best Answer

You can use bash's "process substitution" along with the tee command to do this:

cat drive.image | tee >(dd of=/dev/sda) >(dd of=/dev/sdb) >(dd of=/dev/sdc) | dd of=/dev/sdd

or for clarity (at the expense of a little efficiency) you can make the last dd be called the same way as the others and send the stdout of tee to /dev/null:

cat drive.image | tee >(dd of=/dev/sda) >(dd of=/dev/sdb) >(dd of=/dev/sdc) >(dd of=/dev/sdd) > /dev/null

and if you have it installed you can use pipe viewer instead of cat to get a useful progress indicator:

pv drive.image | tee >(dd of=/dev/sda) >(dd of=/dev/sdb) >(dd of=/dev/sdc) | dd of=/dev/sdd

This reads the source image only once, so the source drive does not suffer the head-thrashing that is probably why you see the exponential slow-down when you try to copy the image multiple times by other methods. Using tee as above, the processes should run at the speed of the slowest destination drive.
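Before pointing this at real devices, it may be worth a dry run against ordinary files. The following is only a sketch under some assumptions: the file names are made-up stand-ins for the image and the target drives, and `bs=1M` and `status=none` assume GNU dd (other dd implementations may spell these differently or lack them).

```bash
#!/usr/bin/env bash
# Dry run of the tee + process substitution fan-out, using regular files
# in place of /dev/sdX so that nothing real gets overwritten.
set -euo pipefail

workdir=$(mktemp -d)
src="$workdir/drive.image"   # hypothetical stand-in for the source image
t1="$workdir/target1.img"    # hypothetical stand-ins for the target drives
t2="$workdir/target2.img"
t3="$workdir/target3.img"

# Build an 8 MiB test image from random data.
head -c 8388608 /dev/urandom > "$src"

# Read the source once. tee feeds two dd writers through process
# substitution and the third through the ordinary pipe.
tee >(dd of="$t1" bs=1M status=none) \
    >(dd of="$t2" bs=1M status=none) < "$src" \
  | dd of="$t3" bs=1M status=none

# Check that every copy is byte-identical to the source.
for t in "$t1" "$t2" "$t3"; do
  cmp "$src" "$t" && echo "identical: $t"
done
```

Because the dd processes inside `>( )` inherit the pipe as their stdout, the final dd only sees end-of-file once they have all exited, so the comparison afterwards should not race with the writers. With the `> /dev/null` form the shell may return before the writers finish, so give them time (or run `sync`) before unplugging anything.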

If the destination drives are connected via USB, be aware that they may all be sharing bus bandwidth, so writing to many in parallel may be no faster than writing to them sequentially, because the USB bus becomes the bottleneck rather than the source or destination drives.

The above assumes you are using Linux or similar (it should work on OSX too, though the device names may be different). If you are using Windows or something else, you need a different solution.

Edit

Imaging over the network has a similar problem to imaging many drives over USB - the transport channel becomes the bottleneck instead of the drives - unless the software you use supports some form of broadcast or multicast transmission.

For the dd method you could probably daisy-chain netcat + tee + dd processes on each machine like so:

  1. Source machine cat/pv/dds the data through nc to destination machine 1.
  2. Destination machine 1 has nc listening for the data from the source machine and piping it through tee which is in turn sending it to dd (and so to the disk) and another nc process which sends to destination machine 2.
  3. Destination machine 2 has nc listening for the data from the destination machine 1 and piping it through tee which is in turn sending it to dd (and so to the disk) and another nc process which sends to destination machine 3.
  4. and so on until the last machine which just has nc picking up the data from the previous machine and sending it to disk via dd.
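The chain above could be sketched roughly as one small script that is run with a different role on each machine. Everything named here is an assumption for illustration: the port number, the function names, and the host and device arguments are placeholders, and `nc -l -p PORT` is the listen syntax of traditional netcat (OpenBSD-style nc listens with `nc -l PORT` instead). I have not run this on a real chain of machines.

```bash
#!/usr/bin/env bash
# Hypothetical relay script for the daisy chain: run it with a role on
# each machine. Start the last machine first and the source last, so that
# every listener is up before its upstream neighbour connects.

PORT=9000   # assumed free port, pick your own

# Source machine: push the image file to the first destination.
send_image() {   # args: image-file next-host
  nc "$2" "$PORT" < "$1"
}

# Middle machines: receive, write to the local disk, forward a copy.
relay_image() {  # args: target-device next-host
  nc -l -p "$PORT" | tee >(nc "$2" "$PORT") | dd of="$1" bs=1M
}

# Last machine: receive and write to the local disk only.
sink_image() {   # args: target-device
  nc -l -p "$PORT" | dd of="$1" bs=1M
}

case "${1:-}" in
  source) send_image  "$2" "$3" ;;
  relay)  relay_image "$2" "$3" ;;
  sink)   sink_image  "$2" ;;
  *)      echo "usage: $0 source IMAGE NEXT_HOST | relay DEVICE NEXT_HOST | sink DEVICE" ;;
esac
```

Depending on the netcat variant, the sending side may need an extra option (such as `-q 0` or `-N`) to close the connection once its input ends. Without it the chain can sit waiting after the last byte has been written.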

This way you are potentially using your full network bandwidth, assuming that your switch and network cards have all negotiated a full-duplex link. Instead of the source machine sending out 10 copies of the data (assuming 10 destination machines), with each limited to 1/10th of the outgoing bandwidth, it is only sending 1. Each destination machine takes one copy of the data and sends it out again. You might need to tweak the buffer size settings of pv, nc and dd to get closer to the best practical performance.

If you can find some software that just supports multicast though, that would be much easier (and probably a little faster)! But the above is the sort of hacky solution I might be daft enough to try...

Edit Again

Another thought. If the drive image compresses well (which it will if large chunks of it are full of zeros), the outgoing bandwidth of the source machine need not be a problem even when sending to many destinations at once. Just compress the image first, transmit that everywhere using tee+nc, and decompress on the destinations (network->nc->decompressor->dd->disk).
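To get a feel for whether compression is worth it, you could try it locally first. This is a sketch with assumptions: gzip is just one possible compressor, the file names are invented, the test image is synthetic (a little random data followed by zeros), and `bs=1M` and `status=none` assume GNU dd. The network hop is left out; only the compress and decompressor->dd steps are exercised.

```bash
#!/usr/bin/env bash
# Local check of the compress-then-restore idea on a mostly-zero image.
set -euo pipefail

workdir=$(mktemp -d)
image="$workdir/drive.image"      # hypothetical stand-in for the real image
restored="$workdir/restored.img"  # hypothetical stand-in for a target disk

# 16 MiB image: 64 KiB of random data followed by zeros.
head -c 65536 /dev/urandom > "$image"
head -c 16711680 /dev/zero >> "$image"

# Source side: compress once, before sending anything anywhere.
gzip -c "$image" > "$image.gz"
raw_size=$(wc -c < "$image")
gz_size=$(wc -c < "$image.gz")
echo "raw: $raw_size bytes, compressed: $gz_size bytes"

# Destination side: decompressor -> dd -> disk.
gunzip -c < "$image.gz" | dd of="$restored" bs=1M status=none

# The restored copy should match the original byte for byte.
cmp "$image" "$restored" && echo "round trip OK"
```

On a real image the ratio depends entirely on how much of the disk is empty or repetitive, so measure your own image before relying on this.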
