Create a 2 TB RAID-1 array out of one 2 TB disk and two 1 TB disks while preserving 1 TB of data

mdadm raid software-raid

I currently have 1 TB of data on a RAID-1 mdadm array. This array consists of one 1 TB disk and a 1 TB partition on a 2 TB disk. I have now bought a second 1 TB disk, and I would like to use this to create a 2 TB RAID-1 array.

The data on this array will be encrypted with dm-crypt, so I want to overwrite each disk with data from /dev/urandom before putting data on it.

The question is whether I can create this 2 TB RAID-1 array and securely overwrite each disk with random data, while preserving the 1 TB of data I have on my current 1 TB RAID-1 array.

So to recap: I have 2 x 1 TB disks and 1 x 2 TB disk. I want to create a 2 TB RAID-1 array where one side of the mirror consists of two 1 TB disks pooled together, and the other side of the mirror is the whole 2 TB disk. Also, I want to preserve 1 TB of data using only these disks, while still being able to overwrite each disk with random data.

Best Answer

As for overwriting each disk with random data, it's redundant. Since you're going to build a new encrypted RAID, the resync will overwrite everything anyway.

As for the method of overwriting, /dev/urandom is horribly slow; people who try to use it for wiping terabytes usually cancel halfway through because it takes far too long. Encrypting the device with a random key and then wiping the mapped device with /dev/zero is faster, and shred -n 1 is much faster still. So if you must have random data on your disk, I recommend you use one of those methods instead.
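For reference, here is a minimal sketch of the "random key, then zeros" method. The device /dev/sdX and the mapping name wipe_target are placeholders, not names from the setup above, and wipe_plan is a hypothetical helper that only prints the commands so you can review them before running anything as root. Depending on your cryptsetup version you may also have to pass the cipher and key size explicitly.

```shell
#!/bin/sh
# Print (do not run) the commands for wiping a device through a throwaway
# dm-crypt mapping. "wipe_target" is a hypothetical default mapping name.
wipe_plan() {
    dev=$1
    name=${2:-wipe_target}
    # Plain mapping keyed from /dev/urandom: the key is never stored, so the
    # zeros written through it end up on disk as ciphertext nobody can decrypt.
    echo "cryptsetup open --type plain --key-file /dev/urandom $dev $name"
    # dd ends with "No space left on device" once the mapping is full;
    # that is the expected way for this command to finish.
    echo "dd if=/dev/zero of=/dev/mapper/$name bs=1M status=progress"
    echo "cryptsetup close $name"
}

wipe_plan /dev/sdX
```

Double-check the device name before pasting the printed commands; they destroy everything on that device.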

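Now to your RAID, I'd do the following: (in this example loop2 stands in for the 2TB disk; the example uses small loop devices, so the sizes in the output are far smaller than on real disks)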

  • Add the new 1TB disk partition to the RAID-1 and wait until the sync is finished. This way, your 1TB of data is mirrored across three disks.

    $ mdadm /dev/md99 --grow --raid-devices=3 --add /dev/loop1p1
    mdadm: added /dev/loop1p1
    raid_disks for /dev/md99 set to 3
    $ cat /proc/mdstat
    md99 : active raid1 loop1p1[2] loop2p1[1] loop0p1[0]
          100224 blocks super 1.2 [3/3] [UUU]
    
  • Remove the 2TB disk from the RAID-1 array. Your 1TB of data is still stored redundantly on the two 1TB disks.

    $ mdadm /dev/md99 --fail /dev/loop2p1
    mdadm: set /dev/loop2p1 faulty in /dev/md99
    $ mdadm /dev/md99 --remove /dev/loop2p1
    mdadm: hot removed /dev/loop2p1 from /dev/md99
    $ mdadm /dev/md99 --grow --raid-devices=2
    raid_disks for /dev/md99 set to 2
    $ cat /proc/mdstat
    md99 : active raid1 loop1p1[2] loop0p1[0]
          100224 blocks super 1.2 [2/2] [UU]
    
  • Wipe the 2TB disk.

    $ shred -n 1 /dev/loop2
    
  • Repartition the disk to 2TB. Note that you will need a separate, unencrypted boot partition if you have no other boot device, as you cannot boot from encrypted devices.

    $ parted /dev/loop2
    
  • Create a new RAID-1 array using that partition, with missing in place of the 2nd device, which will be added later on. The size is limited for now and grown later, because we don't yet know the exact size of the combined 2x1TB disks. (If you are sure, feel free to use a different size here, but you won't be able to add the 2x1TB later if you make it too large.)

    $ mdadm /dev/md42 --create --level=1 --raid-devices=2 --size=1000G /dev/loop2p1 missing
    mdadm: largest drive (/dev/loop2p1) exceeds size (102400K) by more than 1%
    Continue creating array? yes
    mdadm: Defaulting to version 1.2 metadata
    mdadm: array /dev/md42 started.
    $ cat /proc/mdstat
    md42 : active raid1 loop2p1[0]
          102400 blocks super 1.2 [2/1] [U_]
    
    md99 : active raid1 loop1p1[2] loop0p1[0]
          100224 blocks super 1.2 [2/2] [UU]
    
  • Encrypt it, create the filesystem, and copy the data (use your preferred cipher and settings, LVM, filesystem, copy methods, ...)

    $ cryptsetup luksFormat /dev/md42
    $ cryptsetup luksOpen /dev/md42 luksmd42
    $ mkfs.ext4 /dev/mapper/luksmd42
    $ mount /dev/md99 /mnt/old
    $ mount /dev/mapper/luksmd42 /mnt/new
    $ rsync -aAHSX /mnt/old/. /mnt/new/.
    $ umount /mnt/old /mnt/new
    
  • Now your data is stored redundantly on the old RAID 1 (unencrypted) and non-redundantly on the new, still degraded RAID 1 (encrypted). At this point you have to give up the old array's redundancy in order to add the 2x1TB disks to the new RAID 1.

    $ mdadm --stop /dev/md99
    mdadm: stopped /dev/md99
    
  • At this point you could also wipe the 2x1TB disks if you like to waste your time. It's not necessary as the RAID-1 will sync the random data from the 2TB disk over to the 1TB disks.

    $ shred -n 1 /dev/loop0 &
    $ shred -n 1 /dev/loop1 &
    $ wait # for shred
    $ parted /dev/loop0
    $ parted /dev/loop1
    
  • Combine the 2x1TB disks using mdadm, as either a linear array or RAID 0, depending on your preference.

    $ mdadm /dev/md43 --create --level=0 --raid-devices=2 /dev/loop0p1 /dev/loop1p1
    $ cat /proc/mdstat
    md43 : active raid0 loop1p1[1] loop0p1[0]
          199680 blocks super 1.2 512k chunks
    
    md42 : active raid1 loop2p1[0]
          102400 blocks super 1.2 [2/1] [U_]
    
  • Add it to the RAID 1 and grow the RAID 1 at the same time.

    $ mdadm /dev/md42 --add /dev/md43
    mdadm: added /dev/md43
    $ mdadm /dev/md42 --grow --size=max
    mdadm: component size of /dev/md42 has been set to 199616K
    $ cat /proc/mdstat
    md43 : active raid0 loop1p1[1] loop0p1[0]
          199680 blocks super 1.2 512k chunks
    
    md42 : active raid1 md43[2] loop2p1[0]
          199616 blocks super 1.2 [2/2] [UU]
    
  • Grow the LUKS mapping and the filesystem as well.

    $ cryptsetup resize luksmd42
    $ resize2fs /dev/mapper/luksmd42
    resize2fs 1.42.7 (21-Jan-2013)
    Resizing the filesystem on /dev/mapper/luksmd42 to 197568 (1k) blocks.
    The filesystem on /dev/mapper/luksmd42 is now 197568 blocks long.
    

Aaand you're done.
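One caution about the steps above: failing the 2TB member and stopping md99 are only safe once the preceding sync has really finished. mdadm --wait /dev/md99 blocks until a running resync or recovery completes. If you would rather check the state explicitly, the following sketch reads /proc/mdstat; md_is_clean is a hypothetical helper name, and the parsing assumes the usual mdstat layout shown in the transcripts above.

```shell
#!/bin/sh
# Succeed only if the named array appears in mdstat with no missing member
# and no resync/recovery/reshape in progress. The second argument lets you
# point at a saved copy of mdstat instead of /proc/mdstat.
md_is_clean() {
    array=$1
    mdstat=${2:-/proc/mdstat}
    awk -v a="$array" '
        # Header line of the stanza we care about, e.g. "md99 : active raid1 ..."
        $1 == a && $2 == ":"          { in_a = 1; found = 1; next }
        # A blank line ends the stanza.
        in_a && NF == 0               { in_a = 0 }
        # "[U_]" style map containing an underscore means a member is missing.
        in_a && /\[[U_]*_[U_]*\]/     { bad = 1 }
        # A progress or pending line means the array is still syncing.
        in_a && /resync|recovery|reshape/ { bad = 1 }
        END { if (found && !bad) exit 0; exit 1 }
    ' "$mdstat"
}

# Example use before a destructive step (needs a real array, so left commented):
#   md_is_clean md99 && mdadm /dev/md99 --fail /dev/loop2p1
```

An array that is not listed at all counts as not clean, so a typo in the array name fails safe.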

Edit:

Here's an example /etc/mdadm.conf to go with this setup (use mdadm --detail --scan to get a starting point):

ARRAY /dev/md43 metadata=1.2 UUID=b9f590d7:9984dad4:cb75131b:63bca165
ARRAY /dev/md42 metadata=1.2 UUID=3a70188d:9ecacda7:ac715e16:9402fc55

In particular:

  • no DEVICE lines (if you must have them, make sure they include the md* devices)
  • order of ARRAY lines is important - the RAID0 array must be assembled first, which happens if it's listed first in the file.
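If you want to double-check the ordering after editing the file, a small sketch like this compares the positions of the two ARRAY lines. array_listed_before is a hypothetical helper, and the default path is an assumption: some distributions (Debian, for one) keep the file at /etc/mdadm/mdadm.conf instead.

```shell
#!/bin/sh
# Succeed if the ARRAY line for $1 appears before the ARRAY line for $2
# in the given config file (default: /etc/mdadm.conf).
array_listed_before() {
    first=$1
    second=$2
    conf=${3:-/etc/mdadm.conf}
    awk -v a="$first" -v b="$second" '
        # Remember the line number of the first ARRAY entry for each device.
        $1 == "ARRAY" && $2 == a && !la { la = NR }
        $1 == "ARRAY" && $2 == b && !lb { lb = NR }
        END { if (la && lb && la < lb) exit 0; exit 1 }
    ' "$conf"
}

# Example (left commented; adjust the path for your system):
#   array_listed_before /dev/md43 /dev/md42 || echo "RAID0 must be listed first"
```

The check fails if either array is missing from the file, which is also worth knowing before a reboot.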