Linux – Intel SSD DC P3600 1.2TB performance

Tags: linux, performance, ssd

I just got a workstation with an Intel SSD DC P3600 1.2TB on an Asus X99-E WS motherboard. I booted Ubuntu 15.04 from a live CD and ran the Disks (gnome-disks) application to benchmark the SSD. The disk appears as /dev/nvme0n1. I ran the default benchmark (100 samples of 10 MB each, sampled randomly from the whole disk) and the results are disappointing: an average read rate of 720 MB/s, an average write rate of 805 MB/s (higher than the read rate!?) and an average access time of 0.12 ms. Furthermore, the only information Disks shows about the disk is its size; there is no model name or any other information.
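
Since Disks shows no model information, the drive can also be identified from the command line. A minimal sketch (udevadm is standard; the sysfs attributes are an assumption that only holds on newer kernels):

# Query udev's view of the block device
udevadm info --query=all --name=/dev/nvme0n1

# On newer kernels the NVMe controller also exposes its identity in sysfs
cat /sys/class/nvme/nvme0/model /sys/class/nvme/nvme0/serial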

Due to corporate policy I cannot connect this machine to the network before it is set up, so I cannot use any diagnostic tools beyond what is preinstalled (I wanted to follow the official documentation). The documentation states that the NVMe driver is included in Linux kernel 3.19, and Ubuntu 15.04 ships 3.19.0-15-generic, so that should not be the problem. The

dd if=/dev/zero of=/dev/nvme0n1 bs=1M oflag=direct

command from the documentation gives me a write rate of about 620 MB/s and

hdparm -tT --direct /dev/nvme0n1

gives 657 MB/s O_DIRECT cached reads and 664 MB/s O_DIRECT disk reads.
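
Note that dd and hdparm issue one request at a time, while NVMe drives only approach their rated throughput at higher queue depths. A hedged sketch using fio, which is not part of the default install, so it was not an option on the isolated machine (all flags are standard fio options):

# Random 4K reads, queue depth 32, 4 jobs; --direct=1 bypasses the page cache
sudo fio --name=randread --filename=/dev/nvme0n1 --rw=randread --bs=4k \
    --iodepth=32 --numjobs=4 --ioengine=libaio --direct=1 \
    --runtime=30 --time_based --group_reporting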

In the BIOS I have forced the PCIe slot the disk is connected to into PCIe v3.0 mode, and I do not use UEFI boot.
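
To see what the link actually negotiated, rather than what the BIOS was told to do, the two link lines from lspci suffice; for a P3600 both should read 8GT/s x4:

# LnkCap = what the device supports, LnkSta = what was actually negotiated
sudo lspci -vv -s 6:0.0 | grep -E 'LnkCap|LnkSta'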

Edit 1:
The PC supplier connected the SSD to the mainboard using the Hot-swap Backplane PCIe Combination Drive Cage Kit for P4000 Server Chassis FUP8X25S3NVDK (2.5in NVMe SSD).

The device is physically plugged into a PCIe 3.0 x16 slot, but lspci under both CentOS 7 and Ubuntu 15.04 reports it as running at PCIe v1.0 x4 (the LnkSta speed is 2.5 GT/s, which is the PCIe v1.0 signalling rate):

[user@localhost ~]$ sudo lspci -vvv -s 6:0.0
06:00.0 Non-Volatile memory controller: Intel Corporation PCIe Data Center SSD (rev 01) (prog-if 02 [NVM Express])
    Subsystem: Intel Corporation DC P3600 SSD [2.5" SFF]
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 40
    Region 0: Memory at fb410000 (64-bit, non-prefetchable) [size=16K]
    Expansion ROM at fb400000 [disabled] [size=64K]
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [50] MSI-X: Enable+ Count=32 Masked-
        Vector table: BAR=0 offset=00002000
        PBA: BAR=0 offset=00003000
    Capabilities: [60] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 <4us
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
            MaxPayload 256 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L0s L1, Exit Latency L0s <4us, L1 <4us
            ClockPM- Surprise- LLActRep- BwNot-
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
        LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
             Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1-
             EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [100 v1] Advanced Error Reporting
        UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
    Capabilities: [150 v1] Virtual Channel
        Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
        Arb:    Fixed- WRR32- WRR64- WRR128-
        Ctrl:   ArbSelect=Fixed
        Status: InProgress-
        VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
            Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
            Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=01
            Status: NegoPending- InProgress-
    Capabilities: [180 v1] Power Budgeting <?>
    Capabilities: [190 v1] Alternative Routing-ID Interpretation (ARI)
        ARICap: MFVC- ACS-, Next Function: 0
        ARICtl: MFVC- ACS-, Function Group: 0
    Capabilities: [270 v1] Device Serial Number 55-cd-2e-40-4b-fa-80-bc
    Capabilities: [2a0 v1] #19
    Kernel driver in use: nvme
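
The negotiated link explains the numbers almost exactly. A quick back-of-the-envelope check (PCIe 1.0 uses 8b/10b encoding, so 20% of the raw bit rate goes to line coding):

PCIe 1.0 x4: 2.5 GT/s x 4 lanes x 8/10    =  8.0 Gbit/s = 1000 MB/s raw payload, ~800 MB/s after protocol overhead
PCIe 3.0 x4: 8.0 GT/s x 4 lanes x 128/130 = 31.5 Gbit/s = ~3900 MB/s raw payload

The measured 620-800 MB/s is a saturated PCIe 1.0 x4 link, far below what the drive could deliver on the PCIe 3.0 x4 link it is capable of.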

Edit 2:

I tested the drive under CentOS 7 and the performance is identical to what I got on Ubuntu. I should mention that the official documentation states that Intel tested this SSD on CentOS 6.7, which does not seem to exist: after 6.6 the next release was CentOS 7.

Another source of confusion: the benchmark results vary depending on which physical PCIe slot the drive is plugged into. Slots 1-3 give the performance described above, while in slots 4-7 the SSD achieves roughly 100 MB/s higher read speeds.
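
One way to see how the board wires its physical slots (a sketch; how much detail appears depends on the BIOS's DMI tables) is:

# Lists each slot's designation, electrical width and whether it is in use
sudo dmidecode -t slot

On boards where slots hang off a PCIe switch or share lanes, different slots can legitimately negotiate different link parameters.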

The only other PCIe device in the computer is an EVGA Nvidia GT 210 GPU with 512 MB of RAM, which appears to be a PCIe 2.0 x16 device; however, its LnkSta also indicates PCIe v1.0 (2.5 GT/s), at width x8:

[user@localhost ~]$ sudo lspci -vvv -s a:0.0
0a:00.0 VGA compatible controller: NVIDIA Corporation GT218 [GeForce 210] (rev a2) (prog-if 00 [VGA controller])
    Subsystem: eVga.com. Corp. Device 1313
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 114
    Region 0: Memory at fa000000 (32-bit, non-prefetchable) [size=16M]
    Region 1: Memory at c0000000 (64-bit, prefetchable) [size=256M]
    Region 3: Memory at d0000000 (64-bit, prefetchable) [size=32M]
    Region 5: I/O ports at e000 [size=128]
    Expansion ROM at fb000000 [disabled] [size=512K]
    Capabilities: [60] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Address: 00000000fee005f8  Data: 0000
    Capabilities: [78] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
            MaxPayload 128 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap: Port #8, Speed 2.5GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <1us, L1 <4us
            ClockPM+ Surprise- LLActRep- BwNot-
        LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- CommClk-
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
        LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
             Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
             EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [b4] Vendor Specific Information: Len=14 <?>
    Capabilities: [100 v1] Virtual Channel
        Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
        Arb:    Fixed- WRR32- WRR64- WRR128-
        Ctrl:   ArbSelect=Fixed
        Status: InProgress-
        VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
            Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
            Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=01
            Status: NegoPending- InProgress-
    Capabilities: [128 v1] Power Budgeting <?>
    Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
    Kernel driver in use: nouveau

Edit 3:

I have now connected the workstation to the network, installed Intel's Solid-State Drive Data Center Tool (isdct) and updated the firmware, but the benchmark results haven't changed. What is interesting is the tool's output:

[user@localhost ~]$ sudo isdct show -a -intelssd 
ls: cannot access /dev/sg*: No such file or directory
- IntelSSD CVMD5130002L1P2HGN -
AggregationThreshold: 0
Aggregation Time: 0
ArbitrationBurst: 0
AsynchronousEventConfiguration: 0
Bootloader: 8B1B012F
DevicePath: /dev/nvme0n1
DeviceStatus: Healthy
EnduranceAnalyzer: 17.22 Years
ErrorString: 
Firmware: 8DV10151
FirmwareUpdateAvailable: Firmware is up to date as of this tool release.
HighPriorityWeightArbitration: 0
Index: 0
IOCompletionQueuesRequested: 30
IOSubmissionQueuesRequested: 30
LBAFormat: 0
LowPriorityWeightArbitration: 0
ProductFamily: Intel SSD DC P3600 Series
MaximumLBA: 2344225967
MediumPriorityWeightArbitration: 0
MetadataSetting: 0
ModelNumber: INTEL SSDPE2ME012T4
NativeMaxLBA: 2344225967
NumErrorLogPageEntries: 63
NumLBAFormats: 6
NVMePowerState: 0
PCILinkGenSpeed: 1
PCILinkWidth: 4
PhysicalSize: 1200243695616
PowerGovernorMode: 0 (25W)
ProtectionInformation: 0
ProtectionInformationLocation: 0
RAIDMember: False
SectorSize: 512
SerialNumber: CVMD5130002L1P2HGN
SystemTrimEnabled: 
TempThreshold: 85 degree C
TimeLimitedErrorRecovery: 0
TrimSupported: True
WriteAtomicityDisableNormal: 0

Specifically, it lists PCILinkGenSpeed as 1 and PCILinkWidth as 4. I have not found out what an NVMePowerState of 0 means.
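
For what it's worth, in the NVMe specification power state 0 is the highest-performance operational state, so that value should not be the culprit. A hedged sketch for inspecting the power states with nvme-cli (a separate package, not part of the stock install):

# -H decodes the controller data, including the power state descriptor table (ps 0 ... ps N)
sudo nvme id-ctrl /dev/nvme0 -H

# Feature 2 (Power Management) reports the currently selected power state
sudo nvme get-feature /dev/nvme0 -f 2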

My question:

  1. How do I make the SSD run at PCIe v3.0 x4 speed?

Best Answer

This is a hardware issue.

The Hot-swap Backplane PCIe Combination Drive Cage Kit for P4000 Server Chassis FUP8X25S3NVDK (2.5in NVMe SSD) appears to be incompatible with the Asus X99-E WS motherboard. The solution is to connect the SSD using the Asus HyperKit adapter. However, this requires a cable between the HyperKit and the SSD which is bundled with neither and is not currently available for purchase on its own. Such a cable is bundled with the Intel SSD 750 Series (2.5" form factor), and our supplier was able to deliver one as a special service.
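
Once the drive is connected through the HyperKit, the fix can be verified with the same lspci check as before (the bus address may change after re-cabling):

sudo lspci -vvv -s 6:0.0 | grep -i 'lnksta:'
# expected: LnkSta: Speed 8GT/s, Width x4, ...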

Beware of hardware incompatibility issues.
