CentOS – LVM volume is inactive after reboot of CentOS

centos, lvm, rhel

I've reinstalled a Linux server, going from CentOS 6 to 7. The server has three drives – a system SSD (it hosts everything except /home) and two 4TB HDDs that host /home. Everything uses LVM. The two 4TB drives are mirrored (using the RAID option within LVM itself), and they are completely filled by the /home partition.

The problem is that although the 4TB disks are recognized fine and LVM sees the volume without problems, the volume does not get activated automatically. Everything else is activated automatically. I can activate it manually, and it works.

I have an image of the old system drive in /home, and that image contains LVM volumes too. If I mount it with kpartx, LVM picks those volumes up and activates them. Yet I can see no difference between those volumes and the inactive ones.

The root filesystem is LVM too, and that activates just fine.

I see a peculiar thing though: executing lvchange -aay tells me that I need to specify which volumes I want to activate. It doesn't do it automatically either. If I specify lvchange -ay lv_home – that works.
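For reference, this is roughly what the manual activation looks like (the vg_home/lv_home names are the ones from my setup; adjust for yours):

# show all LVs and their activation state ("a" in the attr column means active)
lvs -o lv_name,vg_name,lv_attr
# activate just the home volume
lvchange -ay vg_home/lv_home
# or activate every volume in that volume group
vgchange -ay vg_home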

I cannot find anything that could be responsible for this behavior.

Added: I noticed that the old system (which used init) had vgchange -aay --sysinit in its startup scripts. The new one uses systemd, and I don't see the vgchange call in its scripts. But I also don't know where to put it.

Added 2: I'm starting to figure out systemd. I found where the scripts are located and started understanding how they are called. I also found that I could see the executed units with systemctl -al. This shows me that after starting lvmetad it calls pvscan for each known udev block device. However, at that point there is just one registered udev block device, and that is one of the recognized LVM volumes. The hard drives are there too, but under different paths and much longer names. The recognized block device is something like 8:3, while the hard drives are like /device/something/. I'm not at the server anymore, so I cannot write it down precisely (will fix this later).
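For anyone retracing this investigation, these are roughly the commands I was poking around with (the /dev/sdb device name is only an example, not necessarily the real path on the server):

# list all units, including the LVM/pvscan related ones
systemctl list-units --all | grep -i lvm
# messages from lvmetad for the current boot
journalctl -b -u lvm2-lvmetad.service
# how udev sees one of the HDDs
udevadm info --query=all --name=/dev/sdb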

I think that it has something to do with udev and device detection/mapping. I will continue in the evening and will study udev then.

In case all else fails, I've found the script that calls pvscan and verified that I can modify it to scan all the devices every time. That fixes the problem, but it looks like a rather ugly hack, so I'll try to figure out the real root cause.

Added 3: OK, I still don't know why this happens, but at least I've made a fairly passable workaround: another systemd service that calls pvscan once, right after starting lvmetad. The other call for the specific device is still there, and I think it's actually udev that calls it (that's the only place where I found a reference to it). Why it doesn't call it for the other hard drives – I have no idea.
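For the curious, the workaround unit looked roughly like this. I'm reconstructing it from memory, so the file name and the exact ExecStart line are approximate, not a verbatim copy:

# /etc/systemd/system/lvm-pvscan-all.service (the name is my own choice)
[Unit]
Description=Workaround: rescan all devices for LVM PVs after lvmetad starts
After=lvm2-lvmetad.service
Wants=lvm2-lvmetad.service

[Service]
Type=oneshot
# scan every device and auto-activate whatever is found
ExecStart=/usr/sbin/pvscan --cache --activate ay
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target

It gets enabled with systemctl enable lvm-pvscan-all.service and then runs once per boot.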

Best Answer

I did it! I did it! I fixed it properly (I think).

Here's the story:

After some time the server turned out to be faulty and had to be scrapped. I kept the disks and got everything else new. Then I reinstalled CentOS on the SSD and attached the HDDs. LVM worked nicely: the disks were recognized and the configuration was kept. But the same problem came up again - after a reboot, the volume was inactive.

However, this time I chanced to notice something else - the bootloader passes the following parameters to the kernel:

crashkernel=auto rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rhgb quiet

Hmm, wait a minute, those look FAMILIAR!

A quick Google query, and there we are:

rd.lvm.lv=

only activate the logical volumes with the given name. rd.lvm.lv can be specified multiple times on the kernel command line.

Well now. THAT explains it!

So, the resolution was (gathered from several more Google queries; a consolidated command listing follows the steps):

  1. Modify /etc/default/grub to include the additional volume in the kernel parameters (the GRUB_CMDLINE_LINUX line): crashkernel=auto rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rd.lvm.lv=vg_home/lv_home rhgb quiet
  2. Reconfigure grub with grub2-mkconfig -o /boot/grub2/grub.cfg
  3. Reconfigure initramfs with mkinitrd -f -v /boot/initramfs-3.10.0-327.18.2.el7.x86_64.img 3.10.0-327.18.2.el7.x86_64. Note: your values may vary. Use uname -r to get that kernel version. Or just read up on mkinitrd. (Frankly, I don't know why this step is needed, but apparently it is - I tried without it and it didn't work)
  4. And finally, reinstall grub: grub2-install /dev/sda
  5. Reboot, naturally.
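Put together, and with the values from my machine filled in, the whole sequence was roughly this (the kernel version comes from uname -r, matching the explicit one in step 3):

# 1. add rd.lvm.lv=vg_home/lv_home to GRUB_CMDLINE_LINUX in /etc/default/grub
vi /etc/default/grub
# 2. regenerate the grub configuration
grub2-mkconfig -o /boot/grub2/grub.cfg
# 3. rebuild the initramfs for the running kernel
mkinitrd -f -v /boot/initramfs-$(uname -r).img $(uname -r)
# 4. reinstall the bootloader
grub2-install /dev/sda
# 5. reboot
reboot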

TA-DA! The volume is active on reboot. Add it to fstab and enjoy! :)
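For completeness, here's how to double-check after the reboot and what the fstab entry looks like. The ext4 type below is only an example - use whatever filesystem /home is actually formatted with:

cat /proc/cmdline                  # should now include rd.lvm.lv=vg_home/lv_home
lvs -o lv_name,vg_name,lv_attr     # lv_home should show as active
# and the /etc/fstab line itself:
/dev/vg_home/lv_home  /home  ext4  defaults  0 2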
