Ubuntu – SSD vanishes after a few writes

nvme server ssd

I just installed Ubuntu Server 20.04 on a new NVMe SSD. After a few minutes the system drops the SSD, and it no longer shows up in /dev. The / filesystem is remounted read-only, and I have to use SysRq to get the machine to restart. I have tried a clean install with all default settings, but it still happens.

I usually get a few minutes before the issue occurs, and considerably longer if I leave the machine idle before writing anything. I ran some commands and captured the dmesg output of the failure.

Please note that the other commands were run before the error, because once it occurs the drive no longer appears in lsblk and similar tools.
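To put a number on "a few minutes", something like the following could be left running in a terminal. This is only a rough sketch: the device path `/dev/nvme0n1` comes from the lsblk output below, while the function names, the polling interval, and the `max_checks` limit are made up for illustration.

```python
import os
import time


def device_present(path="/dev/nvme0n1"):
    """Return True while the device node still exists."""
    return os.path.exists(path)


def watch(path="/dev/nvme0n1", interval=5.0, max_checks=None):
    """Poll for the device node.

    Returns the number of seconds elapsed when the node was first found
    missing, or None if it was still present after max_checks polls.
    With max_checks=None the loop runs until the device disappears.
    """
    start = time.monotonic()
    checks = 0
    while max_checks is None or checks < max_checks:
        if not device_present(path):
            return time.monotonic() - start
        checks += 1
        time.sleep(interval)
    return None
```

Calling `print(watch())` from an SSH session or the console prints the elapsed time once the node is gone. Printing is used rather than writing to a file because / is read-only by that point.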

dmesg

[ 2620.639990] nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0x10
[ 2620.708040] nvme 0000:04:00.0: enabling device (0000 -> 0002)
[ 2620.708314] nvme nvme0: Removing after probe failure status: -19
[ 2620.724017] blk_update_request: I/O error, dev nvme0n1, sector 297615936 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
[ 2620.724040] EXT4-fs warning (device dm-0): ext4_end_bio:309: I/O error 10 writing to inode 11930801 (offset 0 size 0 starting block 36808264)
[ 2620.724044] Buffer I/O error on device dm-0, logical block 36808264
[ 2620.724064] blk_update_request: I/O error, dev nvme0n1, sector 385282048 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
[ 2620.724071] EXT4-fs warning (device dm-0): ext4_end_bio:309: I/O error 10 writing to inode 11927953 (offset 0 size 0 starting block 47766528)
[ 2620.724073] Buffer I/O error on device dm-0, logical block 47766528
[ 2620.724081] blk_update_request: I/O error, dev nvme0n1, sector 3267592 op 0x1:(WRITE) flags 0x0 phys_seg 3 prio class 0
[ 2620.724092] blk_update_request: I/O error, dev nvme0n1, sector 213192456 op 0x1:(WRITE) flags 0x800 phys_seg 13 prio class 0
[ 2620.724101] EXT4-fs warning (device dm-0): ext4_end_bio:309: I/O error 10 writing to inode 11927953 (offset 0 size 0 starting block 14721)
[ 2620.724103] Buffer I/O error on device dm-0, logical block 14721
[ 2620.724109] Buffer I/O error on dev dm-0, logical block 26255329, lost sync page write
[ 2620.724112] Buffer I/O error on device dm-0, logical block 14722
[ 2620.724116] Buffer I/O error on device dm-0, logical block 14723
[ 2620.724133] blk_update_request: I/O error, dev nvme0n1, sector 3267576 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
[ 2620.724140] EXT4-fs warning (device dm-0): ext4_end_bio:309: I/O error 10 writing to inode 11927953 (offset 0 size 0 starting block 14719)
[ 2620.724142] Buffer I/O error on device dm-0, logical block 14719
[ 2620.724149] blk_update_request: I/O error, dev nvme0n1, sector 3267552 op 0x1:(WRITE) flags 0x0 phys_seg 2 prio class 0
[ 2620.724155] EXT4-fs warning (device dm-0): ext4_end_bio:309: I/O error 10 writing to inode 11927953 (offset 0 size 0 starting block 14716)
[ 2620.724156] Buffer I/O error on device dm-0, logical block 14716
[ 2620.724160] Buffer I/O error on device dm-0, logical block 14717
[ 2620.724169] blk_update_request: I/O error, dev nvme0n1, sector 3267536 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
[ 2620.724176] EXT4-fs warning (device dm-0): ext4_end_bio:309: I/O error 10 writing to inode 11927953 (offset 0 size 0 starting block 14714)
[ 2620.724177] Buffer I/O error on device dm-0, logical block 14714
[ 2620.724184] blk_update_request: I/O error, dev nvme0n1, sector 3267504 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
[ 2620.724191] EXT4-fs warning (device dm-0): ext4_end_bio:309: I/O error 10 writing to inode 11927953 (offset 0 size 0 starting block 14710)
[ 2620.724192] Buffer I/O error on device dm-0, logical block 14710
[ 2620.724198] blk_update_request: I/O error, dev nvme0n1, sector 3267464 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
[ 2620.724205] EXT4-fs warning (device dm-0): ext4_end_bio:309: I/O error 10 writing to inode 11927953 (offset 0 size 0 starting block 14705)
[ 2620.724212] blk_update_request: I/O error, dev nvme0n1, sector 3267448 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
[ 2620.724219] EXT4-fs warning (device dm-0): ext4_end_bio:309: I/O error 10 writing to inode 11927953 (offset 0 size 0 starting block 14703)
[ 2620.724225] EXT4-fs warning (device dm-0): ext4_end_bio:309: I/O error 10 writing to inode 11927953 (offset 0 size 0 starting block 14650)
[ 2620.724408] Aborting journal on device dm-0-8.
[ 2620.724461] Buffer I/O error on dev dm-0, logical block 26247168, lost sync page write
[ 2620.724475] JBD2: Error -5 detected when updating journal superblock for dm-0-8.
[ 2620.724514] Buffer I/O error on dev dm-0, logical block 0, lost sync page write
[ 2620.724531] EXT4-fs (dm-0): I/O error while writing superblock
[ 2620.724536] EXT4-fs error (device dm-0): ext4_journal_check_start:61: Detected aborted journal
[ 2620.724541] EXT4-fs (dm-0): Remounting filesystem read-only
[ 2620.724553] Buffer I/O error on dev dm-0, logical block 0, lost sync page write
[ 2620.724557] EXT4-fs (dm-0): I/O error while writing superblock
[ 2620.751840] nvme nvme0: failed to set APST feature (-19)
[ 2620.772163] systemd-journald[404]: /var/log/journal/459895d5abee48eea9619d2bc4bcb34d/user-1000.journal: IO error, rotating.
[ 2620.772194] systemd-journald[404]: Failed to rotate /var/log/journal/459895d5abee48eea9619d2bc4bcb34d/system.journal: Read-only file system
[ 2620.772218] systemd-journald[404]: Failed to rotate /var/log/journal/459895d5abee48eea9619d2bc4bcb34d/user-1000.journal: Read-only file system
[ 2620.772381] systemd-journald[404]: Failed to write entry (31 items, 797 bytes) despite vacuuming, ignoring: Input/output error
[ 2620.779360] Buffer I/O error on dev nvme0n1p2, logical block 131072, lost sync page write
[ 2620.779369] JBD2: Error -5 detected when updating journal superblock for nvme0n1p2-8.
[ 2620.779373] Aborting journal on device nvme0n1p2-8.
[ 2620.779376] Buffer I/O error on dev nvme0n1p2, logical block 131072, lost sync page write
[ 2620.779380] JBD2: Error -5 detected when updating journal superblock for nvme0n1p2-8.
[ 2620.781339] systemd-journald[404]: Failed to rotate /var/log/journal/459895d5abee48eea9619d2bc4bcb34d/system.journal: Read-only file system
[ 2620.781362] systemd-journald[404]: Failed to rotate /var/log/journal/459895d5abee48eea9619d2bc4bcb34d/user-1000.journal: Read-only file system
[ 2620.781521] systemd-journald[404]: Failed to write entry (31 items, 789 bytes), ignoring: Input/output error
[ 2622.040977] EXT4-fs error (device dm-0): __ext4_find_entry:1531: inode #11927865: comm bash: reading directory lblock 0
[ 2622.041007] Buffer I/O error on dev dm-0, logical block 0, lost sync page write
[ 2622.041013] EXT4-fs (dm-0): I/O error while writing superblock
[ 2622.536800] audit: type=1400 audit(1600337586.312:62): apparmor="DENIED" operation="open" profile="snap.nextcloud.import" name="/home/max/20200916-103505/apps/" pid=144284 comm="rsync" requested_mask="r" denied_mask="r" fsuid=0 ouid=1000

lspci

00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Root Complex
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Renoir IOMMU
00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
00:01.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe GPP Bridge
00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
00:02.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe GPP Bridge
00:02.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe GPP Bridge
00:02.3 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe GPP Bridge
00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge
00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir Internal PCIe GPP Bridge to Bus
00:08.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Renoir Internal PCIe GPP Bridge to Bus
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 51)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 3
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 5
00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 6
00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 7
01:00.0 USB controller: ASMedia Technology Inc. ASM1042A USB 3.0 Host Controller
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0e)
02:00.1 Serial controller: Realtek Semiconductor Co., Ltd. Device 816a (rev 0e)
02:00.2 Serial controller: Realtek Semiconductor Co., Ltd. Device 816b (rev 0e)
02:00.3 IPMI Interface: Realtek Semiconductor Co., Ltd. Device 816c (rev 0e)
02:00.4 USB controller: Realtek Semiconductor Co., Ltd. Device 816d (rev 0e)
03:00.0 Network controller: Intel Corporation Wi-Fi 6 AX200 (rev 1a)
04:00.0 Non-Volatile memory controller: Sandisk Corp Device 5009 (rev 01)
05:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Renoir (rev c3)
05:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Device 1637
05:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) Platform Security Processor
05:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Renoir USB 3.1
05:00.4 USB controller: Advanced Micro Devices, Inc. [AMD] Renoir USB 3.1
05:00.5 Multimedia controller: Advanced Micro Devices, Inc. [AMD] Raven/Raven2/FireFlight/Renoir Audio Processor (rev 01)
05:00.6 Audio device: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) HD Audio Controller
05:00.7 Signal processing controller: Advanced Micro Devices, Inc. [AMD] Raven/Raven2/Renoir Sensor Fusion Hub
06:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 81)
06:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 81)

lsblk

loop0                       7:0    0  97.1M  1 loop /snap/core/9993
loop1                       7:1    0  55.3M  1 loop /snap/core18/1885
loop2                       7:2    0   9.1M  1 loop /snap/canonical-livepatch/95
loop3                       7:3    0  71.3M  1 loop /snap/lxd/16099
loop4                       7:4    0    55M  1 loop /snap/core18/1880
loop5                       7:5    0  70.6M  1 loop /snap/lxd/16922
loop6                       7:6    0  29.9M  1 loop /snap/snapd/8542
loop7                       7:7    0  30.3M  1 loop /snap/snapd/9279
loop8                       7:8    0 258.1M  1 loop /snap/nextcloud/23171
nvme0n1                   259:0    0 931.5G  0 disk
├─nvme0n1p1               259:1    0   512M  0 part /boot/efi
├─nvme0n1p2               259:2    0     1G  0 part /boot
└─nvme0n1p3               259:3    0   930G  0 part
  └─ubuntu--vg-ubuntu--lv 253:0    0   200G  0 lvm  /

inxi

System:
  Host: ubuntu Kernel: 5.4.0-47-generic x86_64 bits: 64 Console: tty 0
  Distro: Ubuntu 20.04.1 LTS (Focal Fossa)
Machine:
  Type: Mini-pc System: ASUSTeK product: MINIPC PN50 v: 0409
  serial: L7MRCG001401NZ3
  Mobo: ASUSTeK model: PN50 serial: N/A UEFI: ASUSTeK v: 0409
  date: 06/30/2020
CPU:
  Topology: 6-Core model: AMD Ryzen 5 4500U with Radeon Graphics bits: 64
  type: MCP L2 cache: 3072 KiB
  Speed: 1397 MHz min/max: 1400/2375 MHz Core speeds (MHz): 1: 1397 2: 1394
  3: 1397 4: 1397 5: 1397 6: 1397
Graphics:
  Device-1: AMD Renoir driver: N/A
  Display: server: No display server data found. Headless machine?
  tty: 80x40
  Message: Advanced graphics data unavailable for root.
Audio:
  Device-1: AMD driver: snd_hda_intel
  Device-2: AMD Raven/Raven2/FireFlight/Renoir Audio Processor driver: N/A
  Device-3: AMD Family 17h HD Audio driver: snd_hda_intel
  Sound Server: ALSA v: k5.4.0-47-generic
Network:
  Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
  driver: r8169
  IF: enp2s0f0 state: up speed: 1000 Mbps duplex: full
  mac: 24:4b:fe:2d:c4:2d
  Device-2: Intel Wi-Fi 6 AX200 driver: iwlwifi
  IF: wlp3s0 state: down mac: 14:f6:d8:7b:b4:20
Drives:
  Local Storage: total: 931.51 GiB used: 12.02 GiB (1.3%)
  ID-1: /dev/nvme0n1 vendor: Western Digital model: WDS100T2B0C-00PXH0
  size: 931.51 GiB
Partition:
  ID-1: / size: 195.86 GiB used: 11.91 GiB (6.1%) fs: ext4 dev: /dev/dm-0
  ID-2: /boot size: 975.9 MiB used: 103.8 MiB (10.6%) fs: ext4
  dev: /dev/nvme0n1p2
Sensors:
  System Temperatures: cpu: 56.9 C mobo: N/A
  Fan Speeds (RPM): N/A
Info:
  Processes: 190 Uptime: 6m Memory: 30.86 GiB used: 602.8 MiB (1.9%)
  Shell: bash inxi: 3.0.38

Has anyone had this issue before, or does anyone know whether this is a software bug or a hardware issue?

Best Answer

heynnema was correct. The SSD just needed to be reseated, which fixed the issue. Thanks.
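
For anyone reading the dmesg output above, the order of events is consistent with a connection problem. The very first message is the NVMe controller going down with CSTS=0xffffffff, and an all-ones register read typically means the device has stopped responding on the PCIe bus. The block I/O errors and the read-only remount only come afterwards. Below is a minimal sketch for pulling that order out of a saved log. The function names and the three categories are my own choices for illustration, and the sample lines are copied from the log in the question.

```python
import re

# Kernel log lines look like "[ 2620.639990] message".
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")

# Hypothetical categories, chosen to match messages seen in the question.
PATTERNS = {
    "nvme_controller": re.compile(r"^nvme nvme\d+: controller is down"),
    "block_io": re.compile(r"^blk_update_request: I/O error"),
    "ext4_readonly": re.compile(r"^EXT4-fs .*Remounting filesystem read-only"),
}


def first_events(dmesg_text):
    """Return {category: timestamp of its first occurrence}."""
    first = {}
    for line in dmesg_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        ts, msg = float(m.group(1)), m.group(2)
        for name, pat in PATTERNS.items():
            if name not in first and pat.search(msg):
                first[name] = ts
    return first


def csts_all_ones(dmesg_text):
    """True if the first reported CSTS value is 0xffffffff."""
    m = re.search(r"CSTS=0x([0-9a-fA-F]+)", dmesg_text)
    return m is not None and int(m.group(1), 16) == 0xFFFFFFFF


SAMPLE = """\
[ 2620.639990] nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0x10
[ 2620.724017] blk_update_request: I/O error, dev nvme0n1, sector 297615936 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
[ 2620.724541] EXT4-fs (dm-0): Remounting filesystem read-only
"""

if __name__ == "__main__":
    # Print categories in the order they first appear in the log.
    for name, ts in sorted(first_events(SAMPLE).items(), key=lambda kv: kv[1]):
        print(f"{ts:12.6f}  {name}")
    print("CSTS all ones:", csts_all_ones(SAMPLE))
```

If the controller message comes first and the filesystem errors follow, the filesystem is most likely a victim of the drive dropping out rather than the cause. That fits with reseating the SSD being the fix here.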
