Ubuntu – Re-initialise GRUB for a non-bootable UEFI ZFS 16.04 installation

Tags: boot, uefi, zfs

I have a physical machine running Ubuntu 16.04 with a zfs root file system, installed following the instructions at https://github.com/zfsonlinux/zfs/wiki/Ubuntu-16.04-Root-on-ZFS

I installed the bootloader as per instructions:

5.5b For UEFI booting, install GRUB:

grub-install --target=x86_64-efi --efi-directory=/boot/efi \
      --bootloader-id=ubuntu --recheck --no-floppy

The system has been running fine for three months, with several reboots (mainly caused by power cuts).

The root pool is a three way mirror:

NAME                                                                STATE     READ WRITE CKSUM
rpool                                                               ONLINE       0     0     0
  mirror-0                                                          ONLINE       0     0     0
    ata-SAMSUNG_HM500JI_S1WFJ90S818624-part1                        ONLINE       0     0     0
    ata-ST3250820AS_5QE5BVW5-part1                                  ONLINE       0     0     0
    ata-GB0250C8045_9SF0R2RD-part1                                  ONLINE       0     0     0

Originally I also had a hot-spare included in the pool. This morning, I needed the hot spare for use in another server which had a failed device, so I used zpool remove to remove the hot spare from the system, powered down, and physically removed the spare.

Now the server won't boot.

I've tried selecting each of the three remaining physical disks to boot from, but no joy.

At present, I have rebooted using a live CD. Following the early steps in the installation instructions above, I can see all the zfs pools, so the data is all there. I think the next step might be to chroot into this zpool but am not sure how to do so given the different ROOT file systems etc.

I am guessing that the GRUB boot information, for some reason, was only installed onto the disk which was designated as "spare" and which I have now removed. The disk in question is now part of a zfs mirror on a different server so it is not possible to put it back.
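
From the live-CD environment, one quick way to check this guess is to look at whether each remaining disk actually carries an EFI System Partition (the /dev/sda below is just an example; repeat for each disk):

# lsblk -o NAME,SIZE,PARTTYPE,FSTYPE /dev/sda

An EFI System Partition shows up with PARTTYPE c12a7328-f81f-11d2-ba4b-00a2c93ec93b and FSTYPE vfat.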

My question is: what is the easiest way of re-initialising GRUB so that the server will boot? Do I need to chroot into the disk-based system or can this be done from the live-CD environment? If the former, how do I correctly mount the root pool?

Best Answer

Largely cribbed from the ZFS installation instructions at https://github.com/zfsonlinux/zfs/wiki/Ubuntu-16.04-Root-on-ZFS, here are the steps I used to get my system working again.

boot-repair did not work.

Step 1: Prepare The Install Environment

1.1 Boot the Ubuntu Live CD, select Try Ubuntu Without Installing, and open a terminal (press Ctrl-Alt-T).

1.2 Optional: Install the OpenSSH server in the Live CD environment. If you have a second system, using SSH to access the target system can be convenient.

$ sudo apt-get --yes install openssh-server

Set a password on the “ubuntu” (Live CD user) account:

$ passwd

Hint: You can find your IP address with ip addr show scope global. Then, from your main machine, connect with ssh ubuntu@IP.

1.3 Become root:

$ sudo -i

1.4 Install ZFS in the Live CD environment:

# apt-add-repository universe
# apt update

(ignore errors about moving an old database out of the way)

# apt install --yes debootstrap gdisk zfs-initramfs

Step 2: Discover available ZFS pools

2.1 Check if ZFS pools are already imported

# zpool list
# zfs list 

2.2 If the output says "no datasets available", skip to Step 3. If either command lists a pool, it is already imported and needs to be exported first, so that it can be re-imported under a different mount point for the chroot:

# zpool export rpool
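
To confirm the export worked, re-run the list commands; they should now report "no pools available" and "no datasets available" respectively:

# zpool list
# zfs list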

Step 3: Chroot into ZFS pool

3.1 Import the pool to a non-default location. The -N flag (do not mount automatically) is necessary because otherwise the rpool root dataset and the rpool/ROOT/ubuntu dataset would both try to mount at /mnt

# zpool import -N -R /mnt rpool
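
Before mounting anything, it is worth checking that the alternate root took effect, i.e. that the datasets now report mountpoints under /mnt (a sanity check, not part of the original HOWTO):

# zfs list -r -o name,mountpoint,canmount,mounted rpool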

3.2 Mount the root system

# zfs mount rpool/ROOT/ubuntu

3.3 Mount the remaining file systems

# zfs mount -a
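
To confirm everything landed under /mnt rather than over the live CD's own root, list what ZFS currently has mounted (zfs mount with no arguments prints each mounted dataset and its mountpoint):

# zfs mount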

3.4 Bind the virtual filesystems from the LiveCD environment to the new system and chroot into it:

# mount --rbind /dev  /mnt/dev
# mount --rbind /proc /mnt/proc
# mount --rbind /sys  /mnt/sys
# chroot /mnt /bin/bash --login

Note: This is using --rbind, not --bind.
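
A quick sanity check that the chroot really is the installed system rather than the live environment (assuming a stock install, /boot should contain the installed kernel and initrd, and /etc/hostname should show the server's name):

# ls /boot
# cat /etc/hostname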

Step 4: Re-initialise EFI partitions on all root pool components

4.1 Check that the wildcard picks up the correct EFI partitions (the part3 siblings of the root pool's part1 devices):

# for i in /dev/disk/by-id/*ata*part3; do echo $i; done
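
The part3 devices listed should be the EFI siblings of the part1 devices that back the pool; one way to cross-check is to compare them against zpool status:

# zpool status rpool | grep part1
# ls -l /dev/disk/by-id/*ata*part3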

4.2 Create a FAT32 filesystem on each EFI partition and add an entry for /boot/efi for each disk to /etc/fstab, for failover purposes in future:

# for i in /dev/disk/by-id/*ata*part3; \
      do mkdosfs -F 32 -n EFI ${i}; \
      echo PARTUUID=$(blkid -s PARTUUID -o value \
      ${i}) /boot/efi vfat defaults 0 1 >> /etc/fstab; done
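
Before relying on the new entries, it is worth eyeballing the result: blkid should show the freshly created vfat filesystems, and the new lines should be at the end of /etc/fstab (one per disk):

# blkid /dev/disk/by-id/*ata*part3
# tail -n 3 /etc/fstab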

4.3 Mount the EFI partition on the first disk. (The scsi-SATA_diskN names below are the HOWTO's placeholders; substitute the actual ata-* device names from step 4.1.)

# mount /dev/disk/by-id/scsi-SATA_disk1-part3 /boot/efi

4.4 Install grub

# grub-install --target=x86_64-efi --efi-directory=/boot/efi \
      --bootloader-id=ubuntu --recheck --no-floppy
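
If grub-install finishes without errors, the new firmware boot entry can be checked from inside the chroot (assuming efibootmgr is installed and the live CD was itself booted in UEFI mode, so the EFI variables are visible):

# efibootmgr -v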

4.5 Unmount the first partition

# umount /boot/efi

4.6 Mount the EFI partition on the second disk

# mount /dev/disk/by-id/scsi-SATA_disk2-part3 /boot/efi

4.7 Install grub

# grub-install --target=x86_64-efi --efi-directory=/boot/efi \
      --bootloader-id=ubuntu-2 --recheck --no-floppy

4.8 Repeat steps 4.5 to 4.7 for each additional disk, incrementing the --bootloader-id each time.

4.9 For added insurance, do an MBR installation to each disk too

# grub-install /dev/disk/by-id/scsi-SATA_disk1
# grub-install /dev/disk/by-id/scsi-SATA_disk2
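
If these complain that the platform is EFI or that they cannot embed, the BIOS flavour of GRUB probably needs to be installed and named explicitly. A sketch, assuming the disks also have the BIOS boot partition from the HOWTO's legacy-boot partitioning step (if they do not, this MBR step can simply be skipped):

# apt install --yes grub-pc-bin
# grub-install --target=i386-pc /dev/disk/by-id/scsi-SATA_disk1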

Step 5: Reboot

5.1 Quit from the chroot

# exit
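
Before rebooting, it is worth unmounting everything that was mounted under /mnt and exporting the pool, much as the HOWTO does at the end of an install, so that rpool imports cleanly on first boot. Something along these lines, run from the live CD environment (not the chroot):

# mount | grep -v zfs | tac | awk '/\/mnt/ {print $3}' | xargs -i{} umount -lf {}
# zpool export rpool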

5.2 Reboot

# reboot