As a short introduction, I'll describe my setup and what I want to achieve, and then explain my problem. This is my first question on U&L, so I apologize if it comes across as strange. It's fairly long, so maybe new users of BTRFS or Debian can benefit from my experiences or even use it as a kind of tutorial.

Basic hardware setup

I have a Haswell-generation mainboard and CPU with some RAM and two SSDs.

(Note: To allow a proper diagnosis, nothing else is connected besides these parts. If you need further information about them, please feel free to ask for more details.)


I want to install Debian Jessie onto both SSDs with BTRFS in RAID 1 mode, mirroring both drives, so that if one fails I can continue with the other in degraded mode.

What I have achieved so far

Since Jessie is the testing branch at the moment, I started with a clean install of its predecessor Wheezy, because I have had problems installing the testing branch in the past. During the install I partitioned the first drive (say A) manually in the following way:

  • only one single partition (containing all future mount points using all available space on the drive),
  • formatting in BTRFS and
  • leaving out the swap partition.

The installation succeeded on A as expected, and after that I upgraded to Jessie (by editing /etc/apt/sources.list and running apt-get update && apt-get dist-upgrade). I then rebooted. The system came up fine, now running kernel 3.14-2-amd64.
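The sources.list edit mentioned above can be sketched as a small filter. This is only an illustration, assuming the file names the release as wheezy (rather than stable); the function name to_jessie is hypothetical and not part of my original procedure.

```shell
# Sketch: rewrite the release name in sources.list content from wheezy to jessie.
# Reads the file content on stdin and writes the result to stdout, so the
# real /etc/apt/sources.list is not modified by the function itself.
to_jessie() {
    sed 's/wheezy/jessie/g'
}

# Possible usage (as root), after reviewing the output:
#   to_jessie < /etc/apt/sources.list > /etc/apt/sources.list.new
#   mv /etc/apt/sources.list.new /etc/apt/sources.list
#   apt-get update && apt-get dist-upgrade
```

Writing to a separate file first makes it easy to inspect the result before replacing the real file.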

After that I started preparing the other drive (say B). I copied the partition table of A with sfdisk -d /dev/sda > /tmp/ and dumped it back to B with sfdisk /dev/sdb < /tmp/
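As a rough sketch, the two sfdisk calls can be wrapped in a function. The dump file name /tmp/sda.layout and the function name are hypothetical choices for this illustration only. By default the commands are just printed, because the real ones need root and overwrite the partition table of the target disk.

```shell
# Sketch: clone the partition table of one disk onto another with sfdisk.
# By default the commands are only printed; set DRY_RUN=0 to run them (root).
clone_partition_table() {
    # $1 = source disk, $2 = target disk, $3 = dump file (hypothetical name)
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "sfdisk -d $1 > $3"
        echo "sfdisk $2 < $3"
    else
        sfdisk -d "$1" > "$3"    # dump the layout of the source disk
        sfdisk "$2" < "$3"       # write that layout to the target disk
    fi
}

clone_partition_table /dev/sda /dev/sdb /tmp/sda.layout
```

Double-check the target device name before running with DRY_RUN=0, since the target's existing partition table is replaced.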

So now I was ready to add the device and perform the conversion to a RAID 1 with the following commands:

mount /dev/sda1 /mnt
btrfs device add /dev/sdb1 /mnt
btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt

Up to this point everything succeeded: btrfs filesystem show lists the RAID 1 with /dev/sda1 (A) and /dev/sdb1 (B) as members. Additionally, blkid lists both drives with the same UUID and different UUID_SUB values.

I left the mount points in /etc/fstab almost untouched, only adding some mount options relevant to the SSDs:
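To confirm that the balance really converted everything, the output of btrfs filesystem df can be checked for the RAID1 profile. The following is a minimal sketch, assuming output lines of the form "Data, RAID1: total=..., used=..."; the function name all_raid1 is hypothetical, and the GlobalReserve line that newer versions print is skipped.

```shell
# Sketch: succeed only if every block-group line of `btrfs filesystem df`
# reports the RAID1 profile. Reads the command output on stdin.
all_raid1() {
    awk -F'[,:] *' '
        $1 == "GlobalReserve" { next }                 # reserve line, skipped
        NF >= 2 { seen = 1; if ($2 != "RAID1") bad = 1 }
        END { exit (seen && !bad) ? 0 : 1 }
    '
}

# Possible usage (as root):
#   btrfs filesystem df /mnt | all_raid1 && echo "all block groups are RAID1"
```

A line such as "Metadata, DUP: ..." would make the check fail, which usually means the conversion was not applied to that block-group type.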

UUID=01234567-89ab-cdef-0123-456789abcdef / btrfs  defaults,noatime,nodiratime,discard  0   1
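To see which options are actually in effect after a reboot, the options field of /proc/mounts can be read. This is a small sketch with a hypothetical helper name (mount_opts); it assumes the usual /proc/mounts layout of device, mount point, type, options.

```shell
# Sketch: print the active mount options of a mount point.
# Reads /proc/mounts-style lines on stdin; $1 is the mount point.
mount_opts() {
    awk -v mp="$1" '$2 == mp { print $4 }'
}

# Possible usage:
#   mount_opts / < /proc/mounts
```

The root mount point can appear more than once in /proc/mounts, in which case one line is printed per entry.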

After a restart the system booted into the OS fully. The above commands btrfs filesystem show and blkid still show the same results. RAID 1 seems to be running.

My problem and my questions

Before a cold start of the system I unplugged B (simulating a complete device failure), so that only A was available, expecting BTRFS to still start up in degraded mode. But that's not the case: I get errors from the initramfs saying it cannot find the device.

  1. Is that the expected behaviour of BTRFS or is something wrong with my initramfs? (Maybe it's the way BTRFS is telling me one drive got destroyed while being offline.)

    a. Is it or is it not possible to start up with only drive A, plug in (hot-swap) a new drive C and resync online?

    b. On it's mentioned to first mount a failed device in degraded mode and then add the new one. Is that the only way?

    (Remark to myself: I will test plugging in another blank device C and see what happens. Maybe I can answer these questions by myself after working through a German source on

  2. How do I make drive B bootable as well (in case drive A is destroyed)? Is /boot synchronized? Do I only need to install GRUB onto device B with grub-install /dev/sdb, or do I have to do something more?

  3. Is it useful to use the mount option ssd in fstab, because it seems like it's not as states it is enabled automatically? To me it seems like it's only necessary if the OS can't detect the SSD properly.

Best Answer

Answer to question 1 - How to start after one drive failing

I could restore the RAID 1 by doing the following steps:

  1. I took an arbitrarily formatted drive (say C) and plugged it into the same SATA port that the defective drive B had used.

  2. After that I started the computer, and in the boot menu I pressed e to edit the boot commands in the following way:

    a. I scrolled to the relevant start entry and located the following rows:

    set root='hd0,msdos1'
    if [ x$feature_platform_search_hint = xy ]; then
      search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos1 --hint-efi=hd0,msdos1 --hint-baremetal=ahci0,msdos1 --hint='hd0,msdos1'  01234567-89ab-cdef-0123-456789abcdef
    else
      search --no-floppy --fs-uuid --set=root 01234567-89ab-cdef-0123-456789abcdef
    fi
    echo    'Loading Linux 3.14-2-amd64...'
    linux   /boot/vmlinuz-3.14-2-amd64 root=UUID=01234567-89ab-cdef-0123-456789abcdef ro  quiet

    b. Then I edited row 1 and changed the drive number to that of the working disk (in my case it remains hd0; if multiple drives are still plugged in, it might be hd1):

    set root='hd0,msdos1'

    c. I commented out rows 2 to 6 by adding a leading # character:

    #if [ x$feature_platform_search_hint = xy ]; then
    #  search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos1 --hint-efi=hd0,msdos1 --hint-baremetal=ahci0,msdos1 --hint='hd0,msdos1'  01234567-89ab-cdef-0123-456789abcdef
    #else
    #  search --no-floppy --fs-uuid --set=root 01234567-89ab-cdef-0123-456789abcdef
    #fi

    d. After that I edited row 8 and added the root flag for degraded mode (rootflags=degraded):

    linux   /boot/vmlinuz-3.14-2-amd64 root=UUID=01234567-89ab-cdef-0123-456789abcdef ro rootflags=degraded quiet

    e. I pressed F10 to boot the edited entry, and the system started.

  3. After booting the OS fully I had to add the new drive C to my RAID 1. I did it like mentioned on

    a. I mounted the still working drive A:

    mount -o degraded /dev/sda1 /mnt

    b. I added the new drive C:

    btrfs device add /dev/sdb1 /mnt

    c. After that I removed the old devices (in my case drive B):

    btrfs device delete missing /mnt

  4. Finally I checked that everything went well with the commands btrfs filesystem show, blkid and btrfs fi df /mnt, as mentioned above in the question. Both drives have the same UUID but different UUID_SUB values and are reported as being in RAID 1 mode.

    Congratulations, it worked!
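Steps 3a to 3c and the check in step 4 can be collected into one sketch. The helper names run and replace_missing_device are hypothetical, and the device names are the ones from my example. By default each command is only printed, since the real commands need root and change the filesystem.

```shell
# Sketch of steps 3a to 3c plus the check from step 4.
# run() prints each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

replace_missing_device() {
    # $1 = surviving device, $2 = replacement device, $3 = mount point
    run mount -o degraded "$1" "$3"         # 3a: mount the surviving device
    run btrfs device add "$2" "$3"          # 3b: add the replacement
    run btrfs device delete missing "$3"    # 3c: drop the absent device
    run btrfs filesystem show               # 4: inspect the result
}

replace_missing_device /dev/sda1 /dev/sdb1 /mnt
```

Running it without DRY_RUN=0 simply lists the four commands in order, which is a convenient way to review them before doing the real thing.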

Personal note

I treat the described behaviour of a failing initramfs as expected until someone proves me wrong. Maybe it's a way of telling me that I should proceed carefully now because my disk crashed horribly, but that's just a guess.

Explanation of the need for manual degradation

In the meantime I found an interesting discussion related to this topic on the Linux kernel developers mailing list. Because of its relevance I want to cite a passage written by Duncan, which I think is really important to know, especially for new users:

You should be able to mount a two-device btrfs raid1 filesystem with only a single device with the degraded mount option, tho I believe current kernels refuse a read-write mount in that case, so you'll have read-only access until you btrfs device add a second device, so it can do normal raid1 mode once again. [...] Meanwhile, since the degraded mount-opt is in fact a no-op if btrfs can actually find all components of the filesystem, some people choose to simply add degraded to their standard mount options (edit the grub config to add it at every boot), so they don't have to worry about it. However, that is NOT RECOMMENDED, as the accepted wisdom is that the failure to mount undegraded serves as a warning to the sysadmin that something VERY WRONG is happening, and that they need to fix it. They can then add degraded temporarily if they wish, in ordered to get the filesystem to mount and thus be able to boot, but adding the option routinely at every boot bypasses this important warning, and it's all too likely that an admin will thus ignore the problem (or not know about it at all) until too late.


Additional note: Although I have no swap partition(s) on the computer in my example, I would encourage people who want to have them to read the very interesting mail I linked to, because it explains the usage of swap with BTRFS in RAID mode.
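In the spirit of the quoted advice not to hide a missing device, a simple check can flag the problem while the system is running. This is a sketch only: it assumes that btrfs filesystem show prints a line containing "Some devices missing" when a member is absent, and the function name has_missing_device is hypothetical.

```shell
# Sketch: succeed if `btrfs filesystem show` output reports a missing member.
# Reads the command output on stdin.
has_missing_device() {
    grep -q 'Some devices missing'
}

# Possible usage (as root), for example from a cron job:
#   btrfs filesystem show | has_missing_device && echo "WARNING: btrfs device missing"
```

This complements the failed undegraded mount rather than replacing it as a warning sign.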

Answer to question 2 - How to make other drives bootable

As far as I know so far, using grub-install /dev/sdb (even with an additional update-grub) seems not to be enough. I will explain why I think so.

When I tried the reverse case, unplugging drive A while powered off and booting only with drive B, the following happened. The GRUB bootloader appeared and I performed the same steps as in point 2 of the answer to question 1. Right after confirming with F10, the boot process immediately stopped with a blank screen (that is, an active monitor, black background, no cursor). So obviously something is wrong with the bootloader on drive B. (Remember: I've got a RAID 1 and can't boot from my second drive after the first drive "fails".)

I helped myself by doing a hard reset, plugging drive A in again (so A and B were both present) and booting into the OS. Because my drives A and B are absolutely identical, I copied the whole MBR (containing the bootloader) from the working drive A to B in raw mode with dd if=/dev/sda of=/dev/sdb bs=512 count=1. I shut down the computer, unplugged drive A as before, and guess what happened? After performing the steps for degraded mode again, I could finally boot into the OS from drive B alone.

To summarize, I still don't know whether this has to do with my partition table (MSDOS, not GPT), with the grub-install command in combination with BTRFS, or with something else. I also don't know what drawbacks my raw copy may have compared to grub-install. (Maybe someone could clarify this in a comment underneath this answer.)

Please note that I am still researching this and will update this answer again. I want to clear up more, but I need some more time to work through the raw sector layout of the MBR on both drives and figure out whether the problem comes from the bootloader or even from the disk identities.
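For the comparison of both MBRs, a sketch like the following may help. It compares only the boot-code area of two MBR dumps, assuming the classic layout in which the first 446 bytes hold the boot code and the partition table starts at byte 446. The function name and the dump file names are hypothetical.

```shell
# Sketch: succeed if two MBR dumps have identical boot code (first 446 bytes).
# $1 and $2 are files, e.g. created as root with
#   dd if=/dev/sda of=/tmp/mbr-a.bin bs=512 count=1
#   dd if=/dev/sdb of=/tmp/mbr-b.bin bs=512 count=1
same_boot_code() {
    a=$(dd if="$1" bs=446 count=1 2>/dev/null | od -An -v -tx1)
    b=$(dd if="$2" bs=446 count=1 2>/dev/null | od -An -v -tx1)
    [ "$a" = "$b" ]
}

# Possible usage:
#   same_boot_code /tmp/mbr-a.bin /tmp/mbr-b.bin && echo "boot code identical"
```

Working on dump files rather than on the devices directly keeps the comparison read-only with respect to the disks.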

Answer to question 3 - How to handle the mount option ssd

It depends on whether the mainboard reports the drive type correctly. As stated on BTRFS itself relies on values of the OS. Because other modules in the OS may also depend on these values, it is generally better to check /sys/block/sdX/queue/rotational for the appropriate value (0: SSD, 1: HDD). If the value is correct, the ssd option can be left out.
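The check of the rotational flag can be sketched as a short loop over all sd* disks. The helper name disk_kind is hypothetical; the loop simply prints nothing if no matching disk exists.

```shell
# Sketch: report what the kernel believes about each sd* disk.
disk_kind() {
    # $1 is the path to a "rotational" file; 0 means non-rotational (SSD).
    case "$(cat "$1")" in
        0) echo SSD ;;
        1) echo HDD ;;
        *) echo unknown ;;
    esac
}

for f in /sys/block/sd*/queue/rotational; do
    [ -e "$f" ] || continue        # the glob did not match any disk
    echo "$f: $(disk_kind "$f")"
done
```

If an SSD shows up as HDD here, setting the ssd mount option explicitly is the case where it seems to make sense.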

