Linux – LVM: PV missing after reboot

linux, lvm

I have a server (Ubuntu 18.04) with a number of LVM Logical Volumes. After a routine reboot one of them does not come back. After some investigation this is where I am at:

  • the physical disk is an iSCSI device and it is seen by the kernel as /dev/sdc (no errors and the correct size)

  • lvmdiskscan -v sees the PV on /dev/sdc

 
> lvmdiskscan -v
...
/dev/sdc [ 72.76 TiB] LVM physical volume 
  • blkid returns the UUID that I can also find in the LVM configuration

...
/dev/sdc: UUID="fvUXXf-pVOF-EPnn-c8eg-tZ5S-iMVW-wsSFDy" TYPE="LVM2_member"

  • this entry lacks the PARTUUID field that other devices on the system have. Is this a clue? I could not quite connect this piece of information with anything that helps me further.

  • pvscan does not report /dev/sdc

  • pvdisplay does not seem to know about this PV

> pvdisplay /dev/sdc
Failed to find physical volume "/dev/sdc" 

Anybody who can point me in the right direction?

Edited to add the output of pvck -t:

pvck -t /dev/sdc
  TEST MODE: Metadata will NOT be updated and volumes will not be (de)activated.
  Found label on /dev/sdc, sector 1, type=LVM2 001
  Found text metadata area: offset=4096, size=1044480

Also useful to know that this LV was originally made on Ubuntu 14.04, and has been running perfectly fine on Ubuntu 18.04.


Additional output from lvmdiskscan below. This output does not look different from what I get for other VGs on the system: PVs first appear as orphans, then they are associated with a VG and become usable. But for r3vg this does not happen.

 lvm[7852]: Reading label from device /dev/sdc
 lvm[7852]: Opened /dev/sdc RO O_DIRECT
 lvm[7852]: /dev/sdc: block size is 4096 bytes
 lvm[7852]: /dev/sdc: physical block size is 512 bytes
 lvm[7852]: /dev/sdc: lvm2 label detected at sector 1
 lvm[7852]: lvmcache /dev/sdc: now in VG #orphans_lvm2 (#orphans_lvm2) with 0 mda(s).
 lvm[7852]: /dev/sdc: PV header extension version 2 found
 lvm[7852]: /dev/sdc: Found metadata at 5632 size 1015 (in area at 4096 size 1044480) for r3vg (Rpn2x9KOivnVd3m6gM9Rf2p3SYkRFm00)
 lvm[7852]: lvmcache has no info for vgname "r3vg" with VGID Rpn2x9KOivnVd3m6gM9Rf2p3SYkRFm00.
 lvm[7852]: lvmcache has no info for vgname "r3vg".
 lvm[7852]: lvmcache /dev/sdc: now in VG r3vg with 1 mda(s).
 lvm[7852]: lvmcache /dev/sdc: VG r3vg: set VGID to Rpn2x9KOivnVd3m6gM9Rf2p3SYkRFm00.
 lvm[7852]: lvmcache /dev/sdc: VG r3vg: set creation host to leitrim.
 lvm[7852]: lvmcache /dev/sdc: VG r3vg: stored metadata checksum 0x54affad5 with size 1015.
 lvm[7852]: Closed /dev/sdc
 lvm[7852]: /dev/sdc: using cached size 156250918912 sectors
 lvm[7852]: /dev/sdc              [      72.76 TiB] LVM physical volume
 lvm[7852]: 7 disks
 lvm[7852]: 3 partitions
 lvm[7852]: 1 LVM physical volume whole disk
 lvm[7852]: 2 LVM physical volumes
 lvm[7852]: Setting global/notify_dbus to 1
 lvm[7852]: Completed: lvmdiskscan -dddddd

Best Answer

Does the LV become mountable if you do a sudo vgscan and sudo vgchange -ay? If those commands produce errors, you probably have a different problem, and you should add those error messages to your original post.
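For reference, a minimal sketch of that check, assuming the missing VG is r3vg (as in your log output); the LV name and mount point are placeholders you need to substitute:

sudo vgscan
sudo vgchange -ay r3vg
sudo lvs r3vg                                    # the LVs should now show the "a" (active) attribute
sudo mount /dev/mapper/r3vg-lvNAME /mountpoint   # use your real LV name and mount point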

But if the LV becomes ready for mounting after those commands, read on...

The LVM logical volume pathname (e.g. /dev/mapper/vgNAME-lvNAME) in /etc/fstab alone won't give the system a clue that this particular filesystem cannot be mounted until networking and iSCSI have been activated.

Without that clue, the system will assume that filesystem is on a local disk and will attempt to mount it as early as possible, normally before networking has been activated, which will obviously fail with an iSCSI LUN. So you'll need to supply that clue somehow.

One way would be to add _netdev to the mount options for that filesystem in /etc/fstab. From this Ubuntu help page it appears to be supported on Ubuntu. This might actually also trigger a vgscan or similar detection of new LVM PVs (+ possibly other helpful stuff) just before the attempt to mount any filesystems marked with _netdev.
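A sketch of what such a line could look like; the LV name and mount point are placeholders, and ext4 is just an assumed filesystem type:

# /etc/fstab
/dev/mapper/r3vg-lvNAME  /mountpoint  ext4  defaults,_netdev  0  2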

Another way would be to use the systemd-specific mount option x-systemd.requires=<iSCSI initiator unit name>. That should achieve the same thing, by postponing any attempts to mount that filesystem until the iSCSI initiator has been successfully activated.
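On Ubuntu with Open-iSCSI the relevant units are typically iscsid.service and/or open-iscsi.service, but check with systemctl list-units | grep -i iscsi on your own system. A sketch, with the same placeholders as above:

# /etc/fstab
/dev/mapper/r3vg-lvNAME  /mountpoint  ext4  defaults,x-systemd.requires=iscsid.service  0  2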

When the iSCSI initiator activates, it will automatically make any configured LUNs available, and as they become available, LVM should auto-activate any VGs on them. So, once you get the mount attempt postponed, that should be enough.
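If you want to verify that chain after a reboot, something like the following read-only checks should be enough:

sudo iscsiadm -m session     # the session for the LUN behind /dev/sdc should be listed
sudo pvs                     # /dev/sdc should appear as a PV belonging to r3vg
sudo lvs -o +lv_active r3vg  # the LVs in r3vg should report "active"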

The lack of PARTUUID is a clue that the disk/LUN does not have a GPT partition table. Since /dev/sdc is listed as TYPE="LVM2_member" it actually does not have any partition table at all. In theory, it should cause no problems for Linux, but I haven't personally tested an Ubuntu 18.04 system with iSCSI storage, so cannot be absolutely certain.
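You can confirm that with a couple of read-only commands; the exact wording of the parted output varies between versions:

sudo parted /dev/sdc print   # expect "unrecognised disk label" / "Partition Table: unknown"
sudo blkid /dev/sdc          # TYPE="LVM2_member" on the whole disk, with no PARTUUID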


The problem with disks/LUNs with no partition table is that other operating systems won't recognize the Linux LVM header as a sign that the disk is in use, and will happily overwrite it with minimal prompting. If your iSCSI storage administrator has accidentally presented the storage LUN corresponding to your /dev/sdc to another system, this might have happened.

You should find the LVM configuration backup file in the /etc/lvm/backup directory that corresponds to your missing VG, and read it to find the expected UUID of the missing PV. If it matches what blkid reports, ask your storage administrator to double-check his/her recent work for mistakes like the one described above. If it turns out the PV has been overwritten by some other system, any remaining data on the LUN is likely to be more or less corrupted and it would be best to restore it from backup... once you get a new, guaranteed-unconflicted LUN from your iSCSI admin.
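A quick way to compare the two; the backup file is normally named after the VG (so /etc/lvm/backup/r3vg here), and the first PV in it is usually recorded as pv0:

sudo grep -A2 'pv0 {' /etc/lvm/backup/r3vg   # shows the expected PV UUID and device
sudo blkid /dev/sdc                          # shows the UUID currently on the LUN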

If it turns out the actual UUID of /dev/sdc is different from the expected one, someone might have accidentally run pvcreate -f /dev/sdc on it. If that's the only thing that has been done, it's relatively easy to fix. (NOTE: check man vgcfgrestore, chapter REPLACING PHYSICAL VOLUMES, for updated instructions - your LVM tools may be newer than mine.) First restore the UUID:

pvcreate --restorefile /etc/lvm/backup/<your VG backup file> --uuid <the old UUID of /dev/sdc from the backup file> /dev/sdc

Then restore the VG configuration:

vgcfgrestore --file /etc/lvm/backup/<your VG backup file> <name of the missing VG>

After this, it should be possible to activate the VG, and if no other damage has been done, mount the filesystem after that.
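So, roughly, the final steps would be (again with a placeholder LV name and mount point, and an optional read-only filesystem check first):

sudo vgchange -ay r3vg
sudo fsck -n /dev/mapper/r3vg-lvNAME             # optional: checks only, makes no changes
sudo mount /dev/mapper/r3vg-lvNAME /mountpoint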
