To be clear, I expected trouble. The computer is an old HP Z820 (certainly no BIOS support for NVMe) with the latest 2018 BIOS update. The stick is a new(-ish?) Western Digital (Sandisk) model:
WD Black 500GB NVMe SSD – M.2 2280 – WDS500G2X0C
Mounted on a PCIe 3.0 x4 card:
Mailiya M.2 PCIe to PCIe 3.0 x4 Adapter
I am not trying to boot from NVMe, just use it for storage. Linux does see the drive (via lsblk and lspci) and can read … but not write.
This is Ubuntu 18.04.2 LTS with the kernel version:
Linux brutus 4.15.0-46-generic #49-Ubuntu SMP Wed Feb 6 09:33:07 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
(Also tested on 18.10.)
Pulled the Linux sources for this version, and for the current 5.0 Linux (from torvalds/linux on GitHub). There are substantial differences in drivers/nvme between the Ubuntu LTS kernel and current, with updates as recent(!) as yesterday (2019.03.16, per "cd drivers/nvme ; git log").
Like I said at the start, expecting trouble. 🙂
Should mention I am slightly familiar with Linux device drivers, having written one of moderate complexity.
Tried compiling the current Linux 5.0 sources, and "rmmod nvme ; insmod nvme" – which did not work (no surprise). Tried copying the 5.0 nvme driver into the 4.15 tree and compiling – which did not work (also no surprise, but hey, got to try).
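For the record, the out-of-tree build attempt looked roughly like this (a sketch from memory; assumes the matching linux-headers package is installed, and note that nvme.ko also depends on nvme-core.ko from the same build — as said above, neither variant produced a working module):

```shell
# make -C /lib/modules/$(uname -r)/build M=$PWD/drivers/nvme/host modules
# rmmod nvme
# insmod drivers/nvme/host/nvme.ko
```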
Next exercise would be to boot off the current Linux 5.0 kernel. But might as well put this out in public, in case someone else is further along.
Reads seem to work, but slower than expected:
# hdparm -t --direct /dev/nvme0n1
/dev/nvme0n1:
Timing O_DIRECT disk reads: 4840 MB in 3.00 seconds = 1612.83 MB/sec
# dd bs=1M count=8192 if=/dev/nvme0n1 of=/dev/null
8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 4.57285 s, 1.9 GB/s
Writes fail badly:
# dd bs=1M count=2 if=/dev/zero of=/dev/nvme0n1
(hangs)
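A buffered write like the above just hangs in async writeback; adding oflag=direct should make dd fail synchronously with the I/O error instead (equally destructive, of course — it overwrites the start of the disk):

```shell
# dd bs=4k count=1 if=/dev/zero of=/dev/nvme0n1 oflag=direct
```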
From journalctl:
Mar 17 18:49:23 brutus kernel: nvme nvme0: async event result 00010300
Mar 17 18:49:23 brutus kernel: print_req_error: I/O error, dev nvme0n1, sector 0
Mar 17 18:49:23 brutus kernel: buffer_io_error: 118 callbacks suppressed
Mar 17 18:49:23 brutus kernel: Buffer I/O error on dev nvme0n1, logical block 0, lost async page write
[snip]
Mar 17 18:49:23 brutus kernel: print_req_error: I/O error, dev nvme0n1, sector 1024
Mar 17 18:49:23 brutus kernel: print_req_error: I/O error, dev nvme0n1, sector 3072
Poked around a bit with the "nvme" command line tool, but only guessing:
# nvme list -o json
{
"Devices" : [
{
"DevicePath" : "/dev/nvme0n1",
"Firmware" : "101140WD",
"Index" : 0,
"ModelNumber" : "WDS500G2X0C-00L350",
"ProductName" : "Unknown Device",
"SerialNumber" : "184570802442",
"UsedBytes" : 500107862016,
"MaximiumLBA" : 976773168,
"PhysicalSize" : 500107862016,
"SectorSize" : 512
}
]
}
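For anyone poking further with nvme-cli: the controller's power states and its APST (Autonomous Power State Transition) configuration can be dumped like this (a sketch — 0x0c is the APST feature ID, and -H asks for human-readable decoding):

```shell
# nvme id-ctrl /dev/nvme0 | grep -E 'apsta|^ps '
# nvme get-feature /dev/nvme0 -f 0x0c -H
```

In the id-ctrl output, apsta indicates whether the drive supports autonomous transitions, and the ps entries list each power state's entry/exit latencies.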
FYI – lspci output:
03:00.0 Non-Volatile memory controller: Sandisk Corp Device 5002 (prog-if 02 [NVM Express])
Subsystem: Sandisk Corp Device 5002
Physical Slot: 1
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 37
NUMA node: 0
Region 0: Memory at de500000 (64-bit, non-prefetchable) [size=16K]
Region 4: Memory at de504000 (64-bit, non-prefetchable) [size=256]
Capabilities: [80] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [90] MSI: Enable- Count=1/32 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [b0] MSI-X: Enable+ Count=65 Masked-
Vector table: BAR=0 offset=00002000
PBA: BAR=4 offset=00000000
Capabilities: [c0] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <1us, L1 unlimited
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
MaxPayload 256 bytes, MaxReadReq 1024 bytes
DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L0s <256ns, L1 <8us
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 8GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range B, TimeoutDis+, LTR+, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+, EqualizationPhase1+
EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
Capabilities: [150 v1] Device Serial Number 00-00-00-00-00-00-00-00
Capabilities: [1b8 v1] Latency Tolerance Reporting
Max snoop latency: 0ns
Max no snoop latency: 0ns
Capabilities: [300 v1] #19
Capabilities: [900 v1] L1 PM Substates
L1SubCap: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
PortCommonModeRestoreTime=255us PortTPowerOnTime=10us
L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
T_CommonMode=0us LTR1.2_Threshold=0ns
L1SubCtl2: T_PwrOn=10us
Kernel driver in use: nvme
Kernel modules: nvme
Heh. Credit where due. 🙂
preston@brutus:~/sources/linux/drivers/nvme$ git log . | grep -i 'wdc.com\|@sandisk' | sed -e 's/^.*: //' | sort -uf
Adam Manzanares <adam.manzanares@wdc.com>
Bart Van Assche <bart.vanassche@sandisk.com>
Bart Van Assche <bart.vanassche@wdc.com>
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Jeff Lien <jeff.lien@wdc.com>
Also tested with the current (2019.03.17) Linux kernel:
root@brutus:~# uname -a
Linux brutus 5.1.0-rc1 #1 SMP Mon Mar 18 01:03:14 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
root@brutus:~# pvcreate /dev/nvme0n1
/dev/nvme0n1: write failed after 0 of 4096 at 4096: Input/output error
Failed to wipe new metadata area at the start of the /dev/nvme0n1
Failed to add metadata area for new physical volume /dev/nvme0n1
Failed to setup physical volume "/dev/nvme0n1".
From the journal:
Mar 18 02:05:10 brutus kernel: print_req_error: I/O error, dev nvme0n1, sector 8 flags 8801
Mar 18 02:09:06 brutus kernel: print_req_error: I/O error, dev nvme0n1, sector 8 flags 8801
Mar 18 02:09:36 brutus kernel: print_req_error: I/O error, dev nvme0n1, sector 8 flags 8801
So … not working in any version of Linux (yet), it seems.
Best Answer
I don't know whether you're still having these issues, but I'll at least post this in case others run into it.
I have this same drive and use it as my primary drive running 18.04. I've used the Windows firmware utility and haven't seen any updates to this point. I also tested the live environment for 19.04, which has the same freeze-ups/failure to install I experienced with 18.04 and 18.10, so the issue seems to still be open.
The problem appears to be that the drive becomes unstable when it enters low-power states, so the fix is to keep it out of the deepest states via a kernel boot parameter. (The parameter caps the transition latency the kernel will accept for autonomous power-state transitions; 5500 µs is apparently just low enough to exclude this drive's problematic deepest sleep state.) I did this a few months back and have had zero problems on 18.04 since. This method should work on the newer versions (18.10/19.04) as well, but it's a shame that the underlying issue hasn't been fixed yet.
In the GRUB boot menu, press e to edit the startup parameters. Add

nvme_core.default_ps_max_latency_us=5500

at the end of the line containing quiet splash, then press Ctrl-X to boot; the installer should detect the disk in the partitioning step.

After finishing the installation, hold Shift while powering on to enter GRUB again, add the same kernel parameter, and press Ctrl-X to boot. You should see Ubuntu boot up successfully.

To make it permanent, edit /etc/default/grub, add nvme_core.default_ps_max_latency_us=5500 to the kernel parameters there, and run sudo update-grub. Every boot will then pick up the parameter automatically, with no more manual editing.

https://community.wd.com/t/linux-support-for-wd-black-nvme-2018/225446/9