Create a 2 TB RAID-1 array out of one 2 TB disk and two 1 TB disks while preserving 1 TB of data

mdadmraidsoftware-raid

I currently have 1 TB of data on a RAID-1 mdadm array. This array consists of one 1 TB disk and a 1 TB partition on a 2 TB disk. I have now bought a second 1 TB disk, and I would like to use this to create a 2 TB RAID-1 array.

The data on this array will be encrypted with dm-crypt, so I want to overwrite each disk with data from /dev/urandom before putting data on it.

The question is if I can create this 2 TB RAID-1 array, and securely overwrite each disk with random data while preserving the 1 TB of data I have on my current 1 TB RAID-1 array.

So to recap: I have 2 x 1 TB disks and 1 x 2 TB disk. I want to create a 2 TB RAID-1 array where one side of the mirror consists of two 1 TB disks pooled together, and the other side of the mirror is the whole 2 TB disk. Also, I want to preserve 1 TB of data using only these disks, while still being able to overwrite each disk with random data.

Best Answer

As for overwriting each disk with random data, it's redundant. Since you're going to build a new encrypted RAID, the resync will overwrite everything anyway.

As for the method of overwriting, /dev/urandom is horribly slow, people who try to use it for wiping terabytes usually cancel halfway through because it just takes way too long. Encrypting the device with a random key and then wiping that with /dev/zero is faster, and shred -n 1 is much faster still. So if you must have random data on your disk, I recommend you use those methods instead.

Now to your RAID, I'd do the following: (in this example loop2 is the 2TB disk)

Add the new 1TB disk partition to the RAID-1. Wait until the sync is finished. This way, your 1TB of data spans three disks.

$ mdadm /dev/md99 --grow --raid-devices=3 --add /dev/loop1p1
mdadm: added /dev/loop1p1
raid_disks for /dev/md99 set to 3
$ cat /proc/mdstat
md99 : active raid1 loop1p1[2] loop2p1[1] loop0p1[0]
      100224 blocks super 1.2 [3/3] [UUU]

remove the 2TB disk from the RAID-1 array. Your 1TB is still preserved redundantly on the two 1TB disks.

$ mdadm /dev/md99 --fail /dev/loop2p1
mdadm: set /dev/loop2p1 faulty in /dev/md99
$ mdadm /dev/md99 --remove /dev/loop2p1
mdadm: hot removed /dev/loop2p1 from /dev/md99
$ mdadm /dev/md99 --grow --raid-devices=2
raid_disks for /dev/md99 set to 2
$ cat /proc/mdstat
md99 : active raid1 loop1p1[2] loop0p1[0]
      100224 blocks super 1.2 [2/2] [UU]

wipe the 2TB disk
```
$ shred -n 1 /dev/loop2
```
repartition the disk to 2TB. Note that you will need a boot partition if you have no other boot device, as you can not boot from encrypted devices.
```
$ parted /dev/loop2
```

create a new RAID-1 array using that partition, and missing for the 2nd device. The 2nd device will be added later on. Also, the size will be limited for now, to be grown later, as we're not sure of the size of the 2x1TB disks at this point. (If you are sure, feel free to use a different size here, but you won't be able to add the 2x1TB later if you make it too large).

$ mdadm /dev/md42 --create --level=1 --raid-devices=2 --size=1000G /dev/loop2p1 missing
mdadm: largest drive (/dev/loop2p1) exceeds size (102400K) by more than 1%
Continue creating array? yes
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md42 started.
$ cat /proc/mdstat
md42 : active raid1 loop2p1[0]
      102400 blocks super 1.2 [2/1] [U_]

md99 : active raid1 loop1p1[2] loop0p1[0]
      100224 blocks super 1.2 [2/2] [UU]

encrypt it, mkfs, copy data (use your preferred cipher and settings, LVM, filesystem, copy methods, ...)

$ cryptsetup luksFormat /dev/md42
$ cryptsetup luksOpen /dev/md42 luksmd42
$ mkfs.ext4 /dev/mapper/luksmd42
$ mount /dev/md99 /mnt/old
$ mount /dev/md42 /mnt/new
$ rsync -aAHSX /mnt/old/. /mnt/neW/.
$ umount /mnt/old /mnt/new

Now your data is redundant on the old RAID 1 (unencrypted), and the new non-redundant RAID 1 (encrypted). But at this point you have to break redundancy in order to add the 2x1TB disks to the new RAID 1.
```
$ mdadm --stop /dev/md99
mdadm: stopped /dev/md99
```
At this point you could also wipe the 2x1TB disks if you like to waste your time. It's not necessary as the RAID-1 will sync the random data from the 2TB disk over to the 1TB disks.
```
$ shred -n 1 /dev/loop0 &
$ shred -n 1 /dev/loop1 &
$ wait # for shred
$ parted /dev/loop0
$ parted /dev/loop1
```

Combine 2x1TB using mdadm, either linear or 0, depending on your preference.

$ mdadm /dev/md43 --create --level=0 --raid-devices=2 /dev/loop0p1 /dev/loop1p1
$ cat /proc/mdstat
md43 : active raid0 loop1p1[1] loop0p1[0]
      199680 blocks super 1.2 512k chunks

md42 : active raid1 loop2p1[0]
      102400 blocks super 1.2 [2/1] [U_]

Add it to the RAID 1 and grow the RAID 1 at the same time.

 $ mdadm /dev/md42 --add /dev/md43
 mdadm: added /dev/md43
 $ mdadm /dev/md42 --grow --size=max
 mdadm: component size of /dev/md42 has been set to 199616K
 $ cat /proc/mdstat
 md43 : active raid0 loop1p1[1] loop0p1[0]
       199680 blocks super 1.2 512k chunks

 md42 : active raid1 md43[2] loop2p1[0]
       199616 blocks super 1.2 [2/2] [UU]

Grow the cryptsetup and filesystem also.

 $ cryptsetup resize luksmd42
 $ resize2fs /dev/mapper/luksmd42
 resize2fs 1.42.7 (21-Jan-2013)
 Resizing the filesystem on /dev/mapper/luksmd42 to 197568 (1k) blocks.
 The filesystem on /dev/mapper/luksmd42 is now 197568 blocks long.

Aaand you're done.

Edit:

Here's an example /etc/mdadm.conf to go with this setup (use mdadm --detail --scan to get a starting point):

ARRAY /dev/md43 metadata=1.2 UUID=b9f590d7:9984dad4:cb75131b:63bca165
ARRAY /dev/md42 metadata=1.2 UUID=3a70188d:9ecacda7:ac715e16:9402fc55

In particular:

no DEVICE lines (if you must have them, make sure it includes md* devices)
order of ARRAY lines is important - the RAID0 array must be built first, which happens if it's listed first in the file.

Related Solutions

Create a software RAID 1 with one device

The simple answer to the question in the title is "Yes". But what you really want to do is the next step, which is getting the existing data mirrored.

It's possible to convert the existing disk, but it's risky, as mentioned, due the the metadata location. Much better to create an empty (broken) mirror with the new disk and copy the existing data onto it. Then, if it doesn't work, you just boot back to the un-mirrored original.

First, initialize /dev/sdb1 as the new /dev/md0 with a missing drive and initialize the filesystem (I'm assuming ext3, but the choice is yours)

mdadm --create --verbose /dev/md0 --level=mirror --raid-devices=2 /dev/sdb1 missing
mkfs -text3 /dev/md0

Now, /dev/sda1 is most likely your root file system (/) so for safety you should do the next step from a live CD, rescue disk or other bootable system which can access both /dev/sda1 and /dev/md0 although I have successfully done this by dropping to single user mode.

Copy the entire contents of the filesystem on /dev/sda1 to /dev/md0. For example:

mount /dev/sda1 /mnt/a       # only do this if /dev/sda1 isn't mounted as root
mount /dev/md0 /mnt/b
cd /mnt/a                    # or "cd /" if it's the root filesystem
cp -dpRxv . /mnt/b

Edit /etc/fstab or otherwise ensure that on the next boot, /dev/md0 is mounted instead of /dev/sda1. Your system is probably set to boot from /dev/sda1 and the boot parameters probably specify this as the root device, so when rebooting you should manually change this so that the root is /dev/md0 (assuming /dev/sda1 was root). After reboot, check that/dev/md0 is now mounted (df) and that it is running as a degraded mirror (cat /proc/mdstat). Add /dev/sda1 to the array:

mdadm /dev/md0 --add /dev/sda1

Since the rebuild will overwrite /dev/sda1, which metadata version you use is irrelevant. As always when making major changes, take a full backup (if possible) or at least ensure that anything which can't be recreated is safe.

You will need to regenerate your boot config to use /dev/md0 as root (if /dev/sda1 was root) and probably need to regenerate mdadm.conf to ensure /dev/md0 is always started.

Linux – RAID 5 with 4 disks fails to operate with one failed disk

This is a fundamental problem with RAID5—bad blocks on rebuild are a killer.

Oct  2 15:08:51 it kernel: [1686185.573233] md/raid:md0: device xvdc operational as raid disk 0
Oct  2 15:08:51 it kernel: [1686185.580020] md/raid:md0: device xvde operational as raid disk 2
Oct  2 15:08:51 it kernel: [1686185.588307] md/raid:md0: device xvdd operational as raid disk 1
Oct  2 15:08:51 it kernel: [1686185.595745] md/raid:md0: allocated 4312kB
Oct  2 15:08:51 it kernel: [1686185.600729] md/raid:md0: raid level 5 active with 3 out of 4 devices, algorithm 2
Oct  2 15:08:51 it kernel: [1686185.608928] md0: detected capacity change from 0 to 2705221484544
⋮

The array has been assembled, degraded. It has been assembled with xvdc, xvde, and xvdd. Apparently, there is a hot spare:

Oct  2 15:08:51 it kernel: [1686185.615772] md: recovery of RAID array md0
Oct  2 15:08:51 it kernel: [1686185.621150] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
Oct  2 15:08:51 it kernel: [1686185.627626] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
Oct  2 15:08:51 it kernel: [1686185.634024]  md0: unknown partition table
Oct  2 15:08:51 it kernel: [1686185.645882] md: using 128k window, over a total of 880605952k.

The 'partition table' message is unrelated. The other messages are telling you that md is attempting to do a recovery, probably on to a hot spare (which might be the device that failed out before, if you've attempted to remove/re-add it).

⋮
Oct  2 15:24:19 it kernel: [1687112.817845] end_request: I/O error, dev xvde, sector 881423360
Oct  2 15:24:19 it kernel: [1687112.820517] raid5_end_read_request: 1 callbacks suppressed
Oct  2 15:24:19 it kernel: [1687112.821837] md/raid:md0: read error not correctable (sector 881423360 on xvde).
Oct  2 15:24:19 it kernel: [1687112.821837] md/raid:md0: Disk failure on xvde, disabling device.
Oct  2 15:24:19 it kernel: [1687112.821837] md/raid:md0: Operation continuing on 2 devices.

And this here is md attempting to read a sector from xvde (one of the remaining three devices). That fails [bad sector, probably], and md (since the array is degraded) can not recover. It thus kicks the disk out of the array, and with a double-disk failure, your RAID5 is dead.

I'm not sure why its being labeled as a spare—that's weird (though, I guess I normally look at /proc/mdstat, so maybe that's just how mdadm labels it). Also, I thought newer kernels were much more hesitant to kick out for bad blocks—but maybe you're running something older?

What can you do about this?

Good backups. That's always an important part of any strategy to keep data alive.

Make sure that the array gets scrubbed for bad blocks routinely. Your OS may already include a cron job for this. You do this by echoing either repair or check to /sys/block/md0/md/sync_action. "Repair" will also repair any discovered parity errors (e.g., the parity bit doesn't match with the data on the disks).

# echo repair > /sys/block/md0/md/sync_action
#

Progress can be watched with cat /proc/mdstat, or the various files in that sysfs directory. (You can find somewhat up-to-date documentation at the Linux Raid Wiki mdstat article.

NOTE: On older kernels—not sure the exact version—check may not fix bad blocks.

One final option is to switch to RAID6. This will require another disk (you can run a four- or even three-disk RAID6, you probably don't want to). With new enough kernels, bad blocks are fixed on the fly when possible. RAID6 can survive two disk failures, so when one disk has failed, it can still survive a bad block—and thus it'll both map out the bad block and continue the rebuild.

Best Answer

Related Solutions

Create a software RAID 1 with one device

Linux – RAID 5 with 4 disks fails to operate with one failed disk

What can you do about this?

Related Question