Understanding M.2 Protocols – A Comprehensive Guide

Tags: hard-drive, m2-connector, nvme, sata, ssd

(Some things I say in this question are false. Don't forget to read the accepted answer.)

First I want to say that there is no SATA software protocol for data transmission. SATA CD drives, SATA HDDs and SATA SSDs use SCSI as software protocol.

Many people don't know that, and I have even seen accepted answers on SE which are not aware of it. Also on German Wikipedia they compare AHCI to NVMe. They should compare SCSI to NVMe instead. It's a big mistake to think that AHCI is the software protocol used by SATA to transfer data.

Old IDE drives use ATAPI which is SCSI over ATA. So even those used SCSI.

AHCI is only used by SATA controllers to enumerate the disks. Not by disks to transfer data.

See this comment about ATAPI and AHCI.

I have a normal SATA SSD and a SATA DVD drive, and the output of lshw proves that both use SCSI as software protocol. See lines beginning with "bus info:".

*-cdrom
    description: DVD-RAM writer
    product: BD-RE  BH16NS55
    vendor: HL-DT-ST
    physical id: 0
    bus info: scsi@4:0.0.0
    logical name: /dev/cdrom
    logical name: /dev/cdrw
    logical name: /dev/dvd
    logical name: /dev/dvdrw
    logical name: /dev/sr0
    version: 1.01
    capabilities: removable audio cd-r cd-rw dvd dvd-r dvd-ram
    configuration: ansiversion=5 status=nodisc
*-disk
    description: ATA Disk
    product: SanDisk SDSSDH3
    physical id: 1
    bus info: scsi@5:0.0.0
    logical name: /dev/sda
    version: 20RL
    serial: 2140LR450907
    size: 931GiB (1TB)
    capabilities: partitioned partitioned:luks
    configuration: ansiversion=5 logicalsectorsize=512 sectorsize=512

Now there is the M.2 connector which supports PCIe, SATA and USB cards (see Wikipedia). These 3 types are the types used on the physical layer. They define the voltage for example. They say nothing about the software protocol.

  1. Does a SATA M.2 disk talk AHCI or is SCSI enough to transmit the capacity for example?
  2. SATA uses AHCI for enumeration and SCSI for data. A PCIe NVMe disk uses NVMe for data. Which protocol does it use for enumeration? It cannot be NVMe because PCIe graphic cards don't talk NVMe for example. There must be another protocol.
  3. As Wikipedia says M.2 also supports USB. Which software protocol is used for enumeration and which is used for data?
  4. Why are there no M.2 USB disks? (Disks which use USB on the physical layer and are connected to the mainboard's M.2 slot. I am not talking about USB M.2 cases.)
  5. Why are there no M.2 SATA NVMe disks? (SATA on the physical layer and NVMe as software protocol.)

Best Answer

First I want to say that there is no SATA software protocol for data transmission.

That's partially correct, in that the software protocol is named "ATA" instead of "SATA" (which is only a specific physical layer).

However, ATA does exist as a protocol completely distinct from SCSI, and its specifications, such as the ATA command set, can be found in various places, e.g. from the T13 Technical Committee.

SATA CD drives, SATA HDDs and SATA SSDs use SCSI as software protocol.

They don't. Most of them use ATA as the protocol, except for CD/DVD drives which use SCSI-over-ATA (aka ATAPI).

(SATA devices may be connected to a SAS HBA, but they don't switch to SCSI mode even then – it is the HBA that implements the SATA physical layer and the ATA command set alongside SAS/SCSI.)

Old IDE drives use ATAPI which is SCSI over ATA. So even those used SCSI.

Only CD/DVD drives used ATAPI; the rest were purely ATA.

I have a normal SATA SSD and a SATA DVD drive, and the output of lshw proves that both use SCSI as software protocol. See lines beginning with "bus info:".

No, what it shows is Linux using the libata driver which presents ATA devices to the kernel as if they were SCSI devices. Libata is documented as providing "SCSI<->ATA translation for ATA devices according to the T10 SAT specification".

The specification in question is "SCSI to ATA Command Translations", a document by the T10 Technical Committee that describes how such translation can be implemented. The same kind of translation is also used by USB-to-SATA bridges. (T10 has also defined the ability to send pass-through ATA commands for ATA-specific features; this is how features such as ATA SMART are accessed through USB-to-SATA bridges.)
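To make the pass-through idea concrete, here is a rough sketch of how an ATA IDENTIFY DEVICE command might be wrapped in a SCSI ATA PASS-THROUGH (16) command descriptor block. The field values follow my reading of the SAT layout (opcode 0x85, protocol "PIO Data-In", transfer length taken from the sector count field). The function name is made up for illustration, and actually submitting the CDB to a device (for example through the Linux SG_IO ioctl) is deliberately left out.

```python
ATA_PASS_THROUGH_16 = 0x85   # SCSI opcode defined by SAT
ATA_IDENTIFY_DEVICE = 0xEC   # ATA command code
PROTOCOL_PIO_DATA_IN = 4     # SAT protocol field value for PIO Data-In


def identify_via_pass_through_cdb() -> bytes:
    """Build a 16-byte SCSI CDB that carries an ATA IDENTIFY DEVICE command.

    Hypothetical helper: it only assembles the bytes, it does not talk
    to any device.
    """
    cdb = bytearray(16)
    cdb[0] = ATA_PASS_THROUGH_16
    cdb[1] = PROTOCOL_PIO_DATA_IN << 1  # protocol lives in bits 4:1
    # T_DIR=1 (device to host), BYT_BLOK=1 (count in blocks),
    # T_LENGTH=2 (length is given in the sector count field)
    cdb[2] = 0x0E
    cdb[6] = 1                          # sector count: one 512-byte block
    cdb[14] = ATA_IDENTIFY_DEVICE       # the wrapped ATA command
    return bytes(cdb)
```

The point of the sketch is only that the outer command is SCSI while byte 14 still holds a plain ATA opcode; the bridge or libata unwraps it and hands the ATA command to the disk.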

This is a "relatively new" change – booting a Linux kernel from ~2000 would have shown the same disk as an IDE device "/dev/hda" instead. Similarly, almost any non-Linux OS would still show IDE/ATA/SATA devices as distinct from SCSI.

(For a short time, Linux also had similar translation of NVMe into SCSI but this was soon removed in favor of a purely NVMe interface. It seems there are considerably more differences between NVMe and SCSI than there were between ATA and SCSI.)
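Instead of trusting the "scsi@" label, one can look at the device's sysfs path to see what really sits behind it. The following is a minimal sketch, assuming the usual Linux sysfs layout in which libata ports show up as ataN, USB root hubs as usbN and NVMe controllers as nvmeN path components. The function name and the sample paths are hypothetical, and the matching is a heuristic, not an official interface.

```python
import re


def transport_from_sysfs_path(path: str) -> str:
    """Guess the transport behind a Linux block device from its sysfs path.

    Heuristic only: it looks for path components that the kernel
    typically creates for NVMe, USB and libata devices.
    """
    parts = path.strip("/").split("/")

    def has(pattern: str) -> bool:
        return any(re.fullmatch(pattern, p) for p in parts)

    if has(r"nvme\d+"):
        return "nvme"
    if has(r"usb\d+"):
        return "usb"               # usually usb-storage or UAS behind a SCSI host
    if has(r"ata\d+"):
        return "ata via libata"    # ATA device presented as a SCSI device
    if has(r"host\d+"):
        return "scsi"
    return "unknown"


# Hypothetical example paths, shaped like typical sysfs entries.
SATA_PATH = "/sys/devices/pci0000:00/0000:00:17.0/ata5/host4/target4:0:0/4:0:0:0/block/sda"
NVME_PATH = "/sys/devices/pci0000:00/0000:00:1d.0/0000:03:00.0/nvme/nvme0/nvme0n1"
USB_PATH = "/sys/devices/pci0000:00/0000:00:14.0/usb2/2-1/2-1:1.0/host6/target6:0:0/6:0:0:0/block/sdb"
```

On a real system the path for a given disk can be obtained with os.path.realpath("/sys/block/sda").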

Does a SATA M.2 disk talk AHCI or is SCSI enough to transmit the capacity for example?

SATA disks generally don't talk AHCI at all; that's only the interface used between the OS and the SATA host controller (the "HBA"). It stands for "Advanced Host Controller Interface".

An M.2-form-factor SATA disk could include its own AHCI host controller (which then uses PCIe over the M.2 connector), but that's fairly rare. Most of the time, M.2-form SATA devices just use the M.2-provided SATA lanes to the motherboard's existing AHCI host controller.

SATA uses AHCI for enumeration and SCSI for data.

No, if it were a SCSI disk, then SCSI itself would be used both for enumeration (using the SCSI "INQUIRY" command) and for data transfer. SATA disks are not SCSI disks, but the same applies: the ATA command set includes its own enumeration command, IDENTIFY DEVICE.
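As an illustration of the capacity part of the question: the ATA counterpart of INQUIRY is the IDENTIFY DEVICE command, which returns a 512-byte block of 256 little-endian 16-bit words. The sketch below assumes the standard word positions (words 60-61 for the 28-bit sector count, words 100-103 for the 48-bit one, word 83 bit 10 for 48-bit support). The function name is made up, the validity bits of word 83 are not checked, and the logical sector size is passed in rather than read from the data, so treat it as a simplification.

```python
import struct


def capacity_from_identify(data: bytes, logical_sector_size: int = 512) -> int:
    """Return the capacity in bytes described by an IDENTIFY DEVICE block.

    Simplified sketch: assumes the given logical sector size and does
    not validate the feature words.
    """
    if len(data) != 512:
        raise ValueError("IDENTIFY DEVICE data must be exactly 512 bytes")
    words = struct.unpack("<256H", data)

    lba48_supported = bool(words[83] & (1 << 10))
    if lba48_supported:
        # 48-bit sector count spread over four words, lowest word first
        sectors = (words[100]
                   | words[101] << 16
                   | words[102] << 32
                   | words[103] << 48)
    else:
        # 28-bit sector count in two words
        sectors = words[60] | words[61] << 16
    return sectors * logical_sector_size
```

No SCSI and no AHCI structure appears anywhere in this data; the capacity comes straight from the ATA command set.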

AHCI is also involved but in a different place: it is used to enumerate the host controller itself, and to exchange ATA commands and data with the host controller over the PCI "bus".

(In other words, the ATA commands are sent via the AHCI interface from host to controller, then via SATA protocol from controller to disk.)
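To show what "an ATA command sent via AHCI" looks like, here is a small sketch of the 20-byte Register Host-to-Device FIS that a driver places in an AHCI command table and that the controller then transmits over the SATA link. The byte layout follows the commonly documented structure (FIS type 0x27, the C bit marking a command, then command, features, LBA, device and count). The function name and parameters are hypothetical, and the surrounding AHCI structures (command list, PRD table) are not modelled.

```python
FIS_TYPE_REG_H2D = 0x27      # Register FIS, host to device
ATA_IDENTIFY_DEVICE = 0xEC   # ATA command code


def build_h2d_register_fis(command: int, lba: int = 0, count: int = 0,
                           features: int = 0, device: int = 0) -> bytes:
    """Assemble a Register Host-to-Device FIS carrying one ATA command."""
    fis = bytearray(20)
    fis[0] = FIS_TYPE_REG_H2D
    fis[1] = 0x80                      # C bit set: this FIS carries a command
    fis[2] = command
    fis[3] = features & 0xFF           # features, low byte
    fis[4] = lba & 0xFF                # LBA bits 7:0
    fis[5] = (lba >> 8) & 0xFF         # LBA bits 15:8
    fis[6] = (lba >> 16) & 0xFF        # LBA bits 23:16
    fis[7] = device
    fis[8] = (lba >> 24) & 0xFF        # LBA bits 31:24
    fis[9] = (lba >> 32) & 0xFF        # LBA bits 39:32
    fis[10] = (lba >> 40) & 0xFF       # LBA bits 47:40
    fis[11] = (features >> 8) & 0xFF   # features, high byte
    fis[12] = count & 0xFF             # sector count, low byte
    fis[13] = (count >> 8) & 0xFF      # sector count, high byte
    # bytes 14..19: ICC, control and reserved, left at zero here
    return bytes(fis)


def identify_device_fis() -> bytes:
    """The FIS for the ATA enumeration command IDENTIFY DEVICE."""
    return build_h2d_register_fis(ATA_IDENTIFY_DEVICE)
```

The payload is a plain ATA command; AHCI only defines where in host memory the controller finds it.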

Also on German Wikipedia they compare AHCI to NVMe. They should compare SCSI to NVMe instead

AHCI and NVMe are comparable to some extent, as both present a specific programming interface on the PCI bus (just like AHCI can be compared with the legacy I/O-port IDE programming interface).

In other words, NVMe defines both the command layer and the host interface, therefore it is comparable to the combination of SATA + AHCI, or to the combination of SCSI + whatever SCSI HBA interface exists.

A PCIe NVMe disk uses NVMe for data. Which protocol does it use for enumeration? It cannot be NVMe because PCIe graphic cards don't talk NVMe for example. There must be another protocol.

Yes, there are two layers of enumeration.

PCI itself indeed has its own bus enumeration protocol which informs the host OS of the device class and product ID (allowing the right driver to be attached); this protocol is independent of the device type. (As far as I know, the high-level mechanism is mostly the same for PCI Express as it was in classic PCI, despite the low-level details being very different.)

After that, each higher-level protocol driver performs its own enumeration specific to that protocol – e.g. if the device is detected as an NVMe controller, then NVMe commands are issued to query its "disk specific" parameters; if it's detected as an AMD GPU, then the AMD GPU driver does its own thing.
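The first of those two layers can be sketched as follows: every PCI function exposes a configuration header whose first bytes hold the vendor ID, device ID and a three-byte class code, and that is what lets the OS pick a driver before any NVMe or AHCI traffic happens. The offsets and the three class codes in the table are the standard ones as far as I know; the function name, the tiny lookup table and the IDs in the test data are illustrative assumptions only.

```python
import struct

# (base class, subclass, programming interface) -> description
KNOWN_CLASSES = {
    (0x01, 0x06, 0x01): "SATA controller (AHCI)",
    (0x01, 0x08, 0x02): "NVM Express controller",
    (0x03, 0x00, 0x00): "VGA-compatible display controller",
}


def decode_pci_header(config: bytes) -> dict:
    """Decode the identification fields at the start of PCI config space."""
    if len(config) < 12:
        raise ValueError("need at least the first 12 bytes of config space")
    vendor_id, device_id = struct.unpack_from("<HH", config, 0x00)
    # Offsets 0x08..0x0B: revision, programming interface, subclass, base class
    revision, prog_if, subclass, base_class = struct.unpack_from("<BBBB", config, 0x08)
    return {
        "vendor_id": vendor_id,
        "device_id": device_id,
        "revision": revision,
        "class": (base_class, subclass, prog_if),
        "description": KNOWN_CLASSES.get((base_class, subclass, prog_if), "other"),
    }
```

On Linux the raw bytes can be read from the config file under /sys/bus/pci/devices/ for a given device address. Only after a class or ID match does the NVMe (or AHCI, or GPU) driver start speaking its own protocol to the device.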

Why are there no M.2 USB disks? (Disks which use USB on the physical layer and are connected to the mainboard's M.2 slot. I am not talking about USB M.2 cases.)

Most likely there is no need for them, as SATA (and NVMe even more so) is already available on the same slot and tends to offer better performance than USB Mass Storage would.

Most "USB" disks (with few exceptions) are not natively USB/SCSI devices; internally they are SATA disks sitting behind a SATA-to-USB bridge. For such a disk in an M.2 slot, the bridge would be pointless, because the same disk can directly use the existing SATA lanes of the M.2 slot.

Native USB Mass Storage disks do exist (and perhaps even offer decent performance with UASP), but they only make sense when USB is the only option. Native SATA still makes more sense for low-performance disks, and native NVMe for high-performance ones.

Why are there no M.2 SATA NVMe disks? (SATA on the physical layer and NVMe as software protocol.)

Again, because there are better options.

Remember that M.2 slots are not "SATA or NVMe"; they are "SATA or PCIe", allowing any kind of PCIe device to be connected to them… including an AHCI host controller.

So even if this product were aimed at M.2 slots that are PCIe-only (i.e. not having any SATA connections), it would still be much simpler – and cheaper – to include a standard SATA AHCI host controller than a more expensive and less reliable SATA-to-NVMe translator (which internally must contain a SATA host controller anyway).

(And if the M.2 slot already offers SATA lanes, both options are more expensive than directly wiring the same disk to the mainboard's SATA AHCI controller.)
