I am looking for a way to clone a single disk drive to more than one disk drive at the same time.
I have prepared system images on 1TB disks. It takes almost 2 hours to clone one disk to another, and then it goes up exponentially when I need to have, say, 30 disks cloned.
If it were possible to clone one disk to more than a single target, it would simplify the whole procedure a lot.
Also, is there something that prevents this kind of operation? I mean, is there some special reason why every disk cloning program that I know of supports only a single target drive?
Thanks!
Best Answer
You can use bash's "process substitution" along with the tee command to do this: pipe the image into tee, give tee one >(dd of=...) process substitution for each target drive, and pipe tee's stdout into a final dd for the last drive.
Or, for clarity (at the expense of a little efficiency), you can make the last dd be called the same way as the others and send the stdout of tee to /dev/null. And if you have it installed, you can use pipe viewer (pv) instead of cat to get a useful progress indicator.
This reads the source image only once, so the source drive does not suffer head-thrashing, which is probably why you see exponential slow-down when you try to copy the image multiple times by other methods. Using tee like above, the processes should run at the speed of the slowest destination drive.
If the destination drives are connected via USB, be aware that they may all be sharing bus bandwidth, so writing many in parallel may be no faster than writing them sequentially, because the USB bus becomes the bottleneck rather than the source or destination drives.
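The tee plus process substitution approach described above can be sketched roughly as follows. This is only an illustration under assumptions: the scratch directory, the file names (drive.image, disk1 to disk3) and the 4 MiB test image are hypothetical stand-ins so the script is safe to run, and bs=1M is the GNU dd spelling (some BSD dd versions want bs=1m).

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stand-ins so the sketch is safe to run as-is: a scratch image and three
# regular files play the roles of the source image and the target drives.
workdir=$(mktemp -d)
image="$workdir/drive.image"
dd if=/dev/urandom of="$image" bs=1M count=4 2>/dev/null

# Use pv for a progress indicator when it is installed, otherwise cat.
if command -v pv >/dev/null 2>&1; then reader=pv; else reader=cat; fi

# The image is read once. tee copies the stream into each >(...) process
# substitution (one dd per target) and its stdout feeds the last dd.
"$reader" "$image" \
  | tee >(dd of="$workdir/disk1" bs=1M 2>/dev/null) \
        >(dd of="$workdir/disk2" bs=1M 2>/dev/null) \
  | dd of="$workdir/disk3" bs=1M 2>/dev/null

# Note: the substituted dd processes run asynchronously and may finish a
# moment after the pipeline itself returns.
echo "copies written under $workdir"
```

For real drives you would replace the three files with the target devices (for example /dev/sdb, /dev/sdc, /dev/sdd) after double-checking the device names, since dd overwrites whatever it is pointed at.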
The above assumes you are using Linux or similar (it should work on OS X too, though the device names may be different). If you are using Windows or something else, you need a different solution.
Edit
Imaging over the network has a similar problem to imaging many drives over USB - the transport channel becomes the bottleneck instead of the drives - unless the software you use supports some form of broadcast or multicast transmission.
For the dd method you could probably daisy-chain netcat + tee + dd processes on each machine like so:
1. The source machine cat/pv/dds the data through nc to destination machine 1.
2. Destination machine 1 has nc listening for the data from the source machine and piping it through tee, which in turn sends it to dd (and so to the disk) and to another nc process which sends it to destination machine 2.
3. Destination machine 2 has nc listening for the data from destination machine 1 and piping it through tee, which in turn sends it to dd (and so to the disk) and to another nc process which sends it to destination machine 3.
4. And so on until the last machine, which just has nc picking up the data from the previous machine and sending it to disk via dd.
This way you are potentially using your full network bandwidth, assuming that your switch and network cards have all negotiated a full-duplex link. Instead of the source machine sending 10 copies of the data out (assuming 10 destination machines), so that each is limited to 1/10th of the outgoing bandwidth, it is only sending 1. Each destination machine takes one copy of the data and sends it out again. You might need to tweak the buffer size settings of pv, nc and dd to get closer to best practical performance.
If you can find some software that just supports multicast though, that would be much easier (and probably a little faster)! But the above is the sort of hacky solution I might be daft enough to try...
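The daisy chain described above might look something like the sketch below. The function names (send_image, relay_image, receive_image), the port 9000 and the host and device arguments are all hypothetical. netcat options also differ between implementations: "nc -l PORT" is the OpenBSD form, traditional netcat wants "nc -l -p PORT", and some variants need an extra option (for example -N or -q 0) to close the connection at end of input, so treat the flags as assumptions to check against your own nc.

```shell
#!/usr/bin/env bash
PORT=9000   # assumed free TCP port, the same on every machine

# Source machine: stream the image to the first destination.
# Swap cat for pv if you want a progress indicator.
send_image() {      # usage: send_image drive.image dest1
  cat "$1" | nc "$2" "$PORT"
}

# Middle machines: listen for the stream, forward a copy to the next
# machine through tee, and write the other copy to the local disk.
relay_image() {     # usage: relay_image /dev/sdX next-host
  nc -l "$PORT" | tee >(nc "$2" "$PORT" >/dev/null) | dd of="$1" bs=1M
}

# Last machine: just write the incoming stream to disk.
receive_image() {   # usage: receive_image /dev/sdX
  nc -l "$PORT" | dd of="$1" bs=1M
}
```

Start the listeners from the end of the chain backwards (receive_image on the last machine, then relay_image on each earlier one) and run send_image on the source last, so every listener is up before its upstream neighbour connects.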
Edit Again
Another thought. If the drive image compresses well (which it will if large chunks of it are full of zeros), the outgoing bandwidth of the source machine need not be a problem even if sending to many destinations at once. Just compress the image first, transmit that everywhere using tee + nc, and decompress on the destinations (network -> nc -> decompressor -> dd -> disk).
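A rough sketch of the compressed variant, under the same caveats about netcat flags. gzip is only an assumed choice of compressor, and the function names, port 9000 and the two-destination fan-out are hypothetical; add one more >(nc ...) per extra destination.

```shell
#!/usr/bin/env bash
PORT=9000   # assumed free TCP port on every destination

# Source machine: compress once, then fan the compressed stream out.
# tee feeds one nc through a process substitution and the other via stdout.
send_compressed() {     # usage: send_compressed drive.image host1 host2
  gzip -c "$1" | tee >(nc "$2" "$PORT" >/dev/null) | nc "$3" "$PORT"
}

# Each destination: network -> nc -> decompressor -> dd -> disk.
receive_compressed() {  # usage: receive_compressed /dev/sdX
  nc -l "$PORT" | gunzip -c | dd of="$1" bs=1M
}
```

Compressing on the fly costs CPU on the source, so whether this beats sending the raw image depends on how compressible the image really is.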