Ubuntu – RAID (mdadm) – What happens if drives are mismatched in size

Question 1 – Before you answer with "it just takes the smaller disk", hear me out quick. My 3TB WD Reds come out to be 3001 GB in size. Let's say I set up a mirror via mdadm for sdb1 and sdc1, which span 100% of each drive. But suddenly, one of the drives fails. The replacement is a 3TB, weighing in at 3000 GB. What happens when I put in a drive that is smaller than the one currently in the array? I know with a new array using 3000 vs 3001, it would build the array to be 3000. But like I said, what about a current array @ 3001 when I add a smaller drive? Does it restructure itself during the rebuild to be 3000 GB in size?

Question 2 – In the event that I can't add a 3000 GB drive to an array with an existing 3001 GB drive and have it simply downsize to 3000… can I resize the 3001 down a bit?

Question 3 – Or, a better idea: what if I downsize my 3TB drive to 2999 GB? That way, whether the replacement is short by 1 MB, 1 byte, or 10 KB doesn't matter; the array will always pick up the "smaller" drive @ 2999 GB.

Best Answer

I came across this question by accident, but in case anyone is curious, here's an answer supported by experiments.

The Short Version

Bonus Question: can I create an md(4) RAID array out of block devices of unequal size? Yes, but the RAID array will have the size of the smallest block device (minus some overhead for its own housekeeping). If device sizes aren't within 1% of each other, you get a warning.

Question 1: can I add to an existing md(4) RAID array a device smaller than the smallest current member? Nope, sorry. mdadm will flat out refuse to do that to protect your data.

Question 2: can you resize an existing md array? Yes (read the mdadm manpage!), but it may not be worth the effort. You'll have to back everything up, then resize the contents of the RAID device, then resize the device itself. All of this is quite prone to errors, miscalculations, and other things that'll cost you your data (painful experience talking).

It's not worth the risk and effort. If you have a new, blank disk, here's how to resize it and also keep between one and two copies of all your data intact at all times (assuming you have 2-disk RAID1):

  1. Create a new md(4) array on it (with one disk missing).
  2. Recreate the structure of the array contents (crypto, LVM, partition tables, any combination thereof, whatever floats your boat).
  3. Copy the data from the existing disk to the new one.
  4. Reboot, using the new disk.
  5. Wipe the old disk's partition table (or zero the md(4) superblock). If necessary, create the required partitions to match the scheme on the new disk.
  6. Add the old disk to the new array.
  7. Wait for the array members to sync. Have some coffee. Fly to Latin America and pick your own coffee beans, for that matter. :) (If you live in Latin America, fly to Africa instead).
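The seven steps above might translate into commands roughly like the following. This is only a dry-run sketch: it prints commands and executes none of them. Every device name and mount point (/dev/sdb1, /dev/sdc1, /dev/md0, /dev/md1, /mnt/old, /mnt/new) is a hypothetical placeholder, and a bare ext4 filesystem is assumed for step 2. The reboot and bootloader changes of step 4 are not shown.

```shell
# Dry-run sketch: prints the commands for the migration, runs nothing.
# All names below are hypothetical placeholders; substitute your own.
NEW_PART=/dev/sdc1   # partition on the new, blank disk (assumed)
OLD_PART=/dev/sdb1   # partition on the surviving old disk (assumed)
OLD_MD=/dev/md0      # the existing array (assumed)
NEW_MD=/dev/md1      # the new array to create (assumed)

plan() {
  # Step 1: create a degraded two-disk RAID1 with one member missing.
  echo "mdadm --create $NEW_MD -e 1.2 -l 1 -n 2 $NEW_PART missing"
  # Step 2: recreate the contents (a plain ext4 filesystem is assumed).
  echo "mkfs.ext4 $NEW_MD"
  # Step 3: copy the data from the old array to the new one.
  echo "mount $NEW_MD /mnt/new"
  echo "rsync -aHAX /mnt/old/ /mnt/new/"
  # Step 4 (reboot onto the new disk) is manual and not shown.
  # Step 5: stop the old array and wipe its md superblock.
  echo "mdadm --stop $OLD_MD"
  echo "mdadm --zero-superblock $OLD_PART"
  # Step 6: add the old disk to the new array.
  echo "mdadm --add $NEW_MD $OLD_PART"
  # Step 7: watch the resync progress.
  echo "cat /proc/mdstat"
}

plan
```

Printing the plan first makes it easier to double-check device names before running anything destructive by hand.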

Note: yes, this is the same technique 0xC0000022L described in his answer.

Question 3: what if the drive is 1G short? :) Don't worry about it. Chances are your replacement drive will be bigger. In fact, with a strategy like the one above, it pays to get cheaper, larger drives whenever one fails (or for a cheaper upgrade). You can upgrade progressively.

Experimental Proof

Experimental Setup

First, let's fake some block devices. We'll use /tmp/sdx and /tmp/sdy (each 100M), and /tmp/sdz (99M).

cd /tmp
dd if=/dev/zero of=sdx bs=1M count=100
sudo losetup -f sdx
dd if=/dev/zero of=sdy bs=1M count=100
sudo losetup -f sdy
dd if=/dev/zero of=sdz bs=1M count=99  # Here's a smaller one!
sudo losetup -f sdz

This sets up three files as three loopback block devices. Assuming no other loop devices were already in use (losetup -f picks the first free one), they are /dev/loop0, /dev/loop1 and /dev/loop2, mapping to sdx, sdy and sdz respectively. Let's check the sizes:

sudo grep 'loop[012]' /proc/partitions
   7        0     102400 loop0
   7        1     102400 loop1
   7        2     101376 loop2

As expected, we have two loop devices of exactly 100M (102400 KiB = 100 MiB) and one of 99M (exactly 99×1024 1K blocks).
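To cross-check those numbers against the backing files, something like the following could be used. It is a small sketch that assumes GNU stat (as shipped with Ubuntu); file_kib is a helper name made up for this example.

```shell
# Print a file's size in 1 KiB blocks, the unit used by /proc/partitions.
# Assumes GNU stat; BSD stat uses different flags.
file_kib() {
  echo $(( $(stat -c %s "$1") / 1024 ))
}

# Report the backing files from the setup above, if they exist.
for f in /tmp/sdx /tmp/sdy /tmp/sdz; do
  if [ -f "$f" ]; then
    echo "$f: $(file_kib "$f") KiB"
  fi
done
```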

Making a RAID Array out of Identically-Sized Devices

Here goes:

sudo mdadm  --create -e 1.2 -n 2 -l 1 /dev/md100 /dev/loop0 /dev/loop1
mdadm: array /dev/md100 started.

Check the size:

sudo grep md100 /proc/partitions
   9      100     102272 md100

This is precisely what we expect: one look at the mdadm manual reminds us that version 1.2 metadata live near the start of each device, and in this setup they take up 128K: 128 + 102272 = 102400. Now let's destroy it in preparation for the second experiment.

sudo mdadm --stop /dev/md100
sudo mdadm --misc --zero-superblock /dev/loop0
sudo mdadm --misc --zero-superblock /dev/loop1

Making a RAID Array out of Unequally Sized Devices

This time we'll use the small block device.

sudo mdadm  --create -e 1.2 -n 2 -l 1 /dev/md100 /dev/loop0 /dev/loop2
mdadm: largest drive (/dev/loop0) exceeds size (101248K) by more than 1%
Continue creating array? y
mdadm: array /dev/md100 started.

Well, we got warned, but the array was made. Let's check the size:

sudo grep md100 /proc/partitions
   9      100     101248 md100

What we get here is 101248 blocks. 101248 + 128 = 101376 = 99 × 1024. The usable space is that of the smallest device, minus the 128K reserved for RAID metadata. Let's bring it all down again for our last experiment:

sudo mdadm --stop /dev/md100
sudo mdadm --misc --zero-superblock /dev/loop0
sudo mdadm --misc --zero-superblock /dev/loop2
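The sizes reported in the two experiments so far follow from one subtraction, which can be written down as a small sketch. The 128 KiB offset is simply the value observed here; it is an assumption that may not hold for other device sizes or mdadm versions, and array_kib is a helper name made up for this example.

```shell
# Reserved space per member in KiB, as observed in these experiments.
# Assumption: other setups may reserve a different amount.
DATA_OFFSET_KIB=128

# Expected RAID1 size in KiB: the smallest member minus the reserved space.
# Arguments: the size of each member device in KiB.
array_kib() {
  min=$1
  for dev in "$@"; do
    if [ "$dev" -lt "$min" ]; then
      min=$dev
    fi
  done
  echo $(( min - DATA_OFFSET_KIB ))
}

array_kib 102400 102400   # loop0 + loop1, prints 102272
array_kib 102400 101376   # loop0 + loop2, prints 101248
```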

And Finally: Adding a Smaller Device to a Running Array

First, let's make a RAID1 array with just one of the 100M disks. The array will be degraded, but we don't really care. We just want a started array. The missing keyword is a placeholder that says ‘I don't have a device for you yet, start the array now and I'll add one later’.

sudo mdadm  --create -e 1.2 -n 2 -l 1 /dev/md100 /dev/loop0 missing

Again, let's check the size:

sudo grep md100 /proc/partitions
   9      100     102272 md100

Sure enough, it's 128K short of 102400 blocks. Adding the smaller disk:

sudo mdadm  --add /dev/md100 /dev/loop2
mdadm: /dev/loop2 not large enough to join array

Boom! It won't let us, and the error is very clear.
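The refusal can be anticipated with a rough pre-check before buying or inserting a replacement. The sketch below only approximates the comparison: it assumes the new device needs the array's per-device size plus the same 128 KiB reserved space seen above, which may not match what mdadm does in every case. can_join is a made-up helper name. On a real system the inputs could come from blockdev --getsize64 (bytes, so divide by 1024) and the "Used Dev Size" line of mdadm --detail.

```shell
# Rough pre-check: would a device of NEW_KIB fit an array whose per-device
# size is ARRAY_KIB? The third argument is the assumed reserved space in KiB
# (default 128, the value observed in these experiments).
can_join() {
  new_kib=$1
  array_kib_needed=$2
  offset_kib=${3:-128}
  [ $(( new_kib - offset_kib )) -ge "$array_kib_needed" ]
}

# The 99M device against the degraded array built from the 100M device.
if can_join 101376 102272; then
  echo "fits"
else
  echo "too small"
fi
```

With the numbers from this experiment the check prints "too small", matching mdadm's refusal.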
