Understanding M.2 Protocols – A Comprehensive Guide

Tags: hard drive, m2-connector, nvme, sata, ssd

(Some things I say in this question are false. Don't forget to read the accepted answer.)

First I want to say that there is no SATA software protocol for data transmission. SATA CD drives, SATA HDDs and SATA SSDs use SCSI as software protocol.

Many people don't know that, and I have even seen accepted answers on SE that are not aware of it. Also on German Wikipedia they compare AHCI to NVMe. They should compare SCSI to NVMe instead. It's a big mistake to think that AHCI is the software protocol used by SATA to transfer data.

Old IDE drives use ATAPI which is SCSI over ATA. So even those used SCSI.

AHCI is only used by SATA controllers to enumerate the disks. Not by disks to transfer data.

See this comment about ATAPI and AHCI.

I have a normal SATA SSD and SATA DVD drive and the output of lshw proves that both use SCSI as software protocol. See lines beginning with "bus info:".

*-cdrom
    description: DVD-RAM writer
    product: BD-RE  BH16NS55
    vendor: HL-DT-ST
    physical id: 0
    bus info: scsi@4:0.0.0
    logical name: /dev/cdrom
    logical name: /dev/cdrw
    logical name: /dev/dvd
    logical name: /dev/dvdrw
    logical name: /dev/sr0
    version: 1.01
    capabilities: removable audio cd-r cd-rw dvd dvd-r dvd-ram
    configuration: ansiversion=5 status=nodisc
*-disk
    description: ATA Disk
    product: SanDisk SDSSDH3
    physical id: 1
    bus info: scsi@5:0.0.0
    logical name: /dev/sda
    version: 20RL
    serial: 2140LR450907
    size: 931GiB (1TB)
    capabilities: partitioned partitioned:luks
    configuration: ansiversion=5 logicalsectorsize=512 sectorsize=512

Now there is the M.2 connector which supports PCIe, SATA and USB cards (see Wikipedia). These 3 types are the types used on the physical layer. They define the voltage for example. They say nothing about the software protocol.

Does a SATA M.2 disk talk AHCI or is SCSI enough to transmit the capacity for example?
SATA uses AHCI for enumeration and SCSI for data. A PCIe NVMe disk uses NVMe for data. Which protocol does it use for enumeration? It cannot be NVMe because PCIe graphic cards don't talk NVMe for example. There must be another protocol.
As Wikipedia says M.2 also supports USB. Which software protocol is used for enumeration and which is used for data?
Why are there no M.2 USB disks? (Disks which use USB on the physical layer and are connected to the mainboard's M.2 slot. I am not talking about USB M.2 cases.)
Why are there no M.2 SATA NVMe disks? (SATA on the physical layer and NVMe as software protocol.)

Best Answer

> First I want to say that there is no SATA software protocol for data transmission.

That's partially correct, in that the software protocol is named "ATA" instead of "SATA" (which is only a specific physical layer).

However, ATA does exist as a protocol completely distinct from SCSI and its specifications such as the ATA command set can be found in various places, e.g. from the T13 Technical Committee.

> SATA CD drives, SATA HDDs and SATA SSDs use SCSI as software protocol.

They don't. Most of them use ATA as the protocol, except for CD/DVD drives which use SCSI-over-ATA (aka ATAPI).

(SATA devices may be connected to a SAS HBA, but they don't switch to SCSI mode even then – it is the HBA that implements the SATA physical layer and the ATA command set alongside SAS/SCSI.)

> Old IDE drives use ATAPI which is SCSI over ATA. So even those used SCSI.

Only CD/DVD drives used ATAPI; the rest were purely ATA.

> I have a normal SATA SSD and SATA DVD drive and the output of lshw proves that both use SCSI as software protocol. See lines beginning with "bus info:".

No, what it shows is Linux using the libata driver which presents ATA devices to the kernel as if they were SCSI devices. Libata is documented as providing "SCSI<->ATA translation for ATA devices according to the T10 SAT specification".

The specification in question is "SCSI to ATA Command Translations", a document by the T10 Technical Committee that describes how such translation can be implemented. The same kind of translation is also used by USB-to-SATA bridges. (T10 has also defined the ability to send pass-through ATA commands for ATA-specific features; this is how features such as ATA SMART are accessed through USB-to-SATA bridges.)

This is a "relatively new" change – booting a Linux kernel from ~2000 would have shown the same disk as an IDE device "/dev/hda" instead. Similarly, almost any non-Linux OS would still show IDE/ATA/SATA devices as distinct from SCSI.

(For a short time, Linux also had similar translation of NVMe into SCSI but this was soon removed in favor of a purely NVMe interface. It seems there are considerably more differences between NVMe and SCSI than there were between ATA and SCSI.)
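As a rough illustration of the naming point above, this shell sketch guesses which Linux kernel subsystem presents a block device, purely from its node name. The helper name `classify_blockdev` and its labels are hypothetical, and the mapping is only a heuristic: an `sd*` name says the device is presented through the SCSI layer, not that the disk itself speaks SCSI.

```shell
#!/bin/sh
# Heuristic only: guess the Linux subsystem behind a block device node name.
# classify_blockdev is a hypothetical helper, not part of any standard tool.
classify_blockdev() {
    name=${1#/dev/}    # accept either "sda" or "/dev/sda"
    case "$name" in
        nvme*) echo "nvme: native NVMe driver, no SCSI translation" ;;
        sr*)   echo "scsi-cdrom: optical drive presented via the SCSI layer" ;;
        sd*)   echo "scsi-disk: SCSI layer (may be ATA via libata, USB, or SAS)" ;;
        hd*)   echo "ide: legacy IDE driver, as on old kernels" ;;
        *)     echo "unknown" ;;
    esac
}

# Names taken from the lshw output in the question, plus an NVMe example.
classify_blockdev /dev/sda
classify_blockdev /dev/sr0
classify_blockdev nvme0n1
```

The design choice here is deliberate: the function reports the kernel's presentation layer and leaves the real command set open, which is exactly the distinction the lshw output hides.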

> Does a SATA M.2 disk talk AHCI or is SCSI enough to transmit the capacity for example?

SATA disks generally don't talk AHCI at all; that's only the interface used between the OS and the SATA host controller (the "HBA"). It stands for "Advanced Host Controller Interface".

An M.2-form-factor SATA disk could include its own AHCI host controller (which then uses PCIe over the M.2 connector), but that's fairly rare. Most of the time, M.2-form SATA devices just use the M.2-provided SATA lanes to the motherboard's existing AHCI host controller.

> SATA uses AHCI for enumeration and SCSI for data.

No, if it were an SCSI disk, then the same SCSI would be used for enumeration (using the SCSI "INQUIRY" command or its ATA equivalent) and data transfer. SATA disks are not SCSI disks but the same applies; the ATA command set also includes enumeration.

AHCI is also involved but in a different place: it is used to enumerate the host controller itself, and to exchange ATA commands and data with the host controller over the PCI "bus".

(In other words, the ATA commands are sent via the AHCI interface from host to controller, then via SATA protocol from controller to disk.)

> Also on German Wikipedia they compare AHCI to NVMe. They should compare SCSI to NVMe instead.

AHCI and NVMe are comparable to some extent, as both present a specific programming interface on the PCI bus (just like AHCI can be compared with the legacy I/O-port IDE programming interface).

In other words, NVMe defines both the command layer and the host interface, therefore it is comparable to the combination of SATA + AHCI, or to the combination of SCSI + whatever SCSI HBA interface exists.

> A PCIe NVMe disk uses NVMe for data. Which protocol does it use for enumeration? It cannot be NVMe because PCIe graphic cards don't talk NVMe for example. There must be another protocol.

Yes, there are two layers of enumeration.

PCI itself indeed has its own bus enumeration protocol which informs the host OS of the device class and product ID (allowing the right driver to be attached); this protocol is independent of the device type. (As far as I know, the high-level mechanism is mostly the same for PCI Express as it was in classic PCI, despite the low-level details being very different.)

After that, each higher-level protocol driver performs its own enumeration specific to that protocol – e.g. if the device is detected as an NVMe controller, then NVMe commands are issued to query its "disk specific" parameters; if it's detected as an AMD GPU, then the AMD GPU driver does its own thing.
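The first enumeration layer can be sketched in shell. On Linux, each PCI device exposes its 24-bit class code in sysfs, and that code is what lets the OS pick a driver before any NVMe or ATA command is sent. The helper name `pci_class_name` is hypothetical and the table covers only a few well-known class codes; treat it as a minimal sketch, not a complete decoder.

```shell
#!/bin/sh
# Map a PCI class code (as found in /sys/bus/pci/devices/<addr>/class)
# to a short description. pci_class_name is a hypothetical helper and
# lists only a handful of class codes relevant to this discussion.
pci_class_name() {
    case "$1" in
        0x010802) echo "NVMe controller" ;;
        0x010601) echo "SATA controller (AHCI)" ;;
        0x0101*)  echo "IDE controller" ;;
        0x0300*)  echo "VGA-compatible display controller" ;;
        0x0c0330) echo "USB controller (xHCI)" ;;
        *)        echo "other" ;;
    esac
}

# List whatever PCI devices this machine exposes; prints nothing when
# sysfs is unavailable (for example in a minimal container).
for dev in /sys/bus/pci/devices/*; do
    [ -r "$dev/class" ] || continue
    printf '%s  %s\n' "${dev##*/}" "$(pci_class_name "$(cat "$dev/class")")"
done
```

Only after this generic step does a protocol-specific driver take over and run its own enumeration, as described above.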

> Why are there no M.2 USB disks? (Disks which use USB on the physical layer and are connected to the mainboard's M.2 slot. I am not talking about USB M.2 cases.)

Most likely there is no need for them, as SATA (and NVMe even more so) is already available on the same slot and tends to offer better performance than USB Mass Storage would.

Most "USB" disks (with few exceptions) are not natively USB/SCSI; internally they are SATA disks behind a SATA-to-USB bridge. For such a disk to be connected to an M.2 slot, it would be pointless to use a SATA-to-USB bridge if the same disk can just directly use the existing SATA lanes of the M.2 slot.

Native USB Mass Storage disks do exist (and perhaps even offer decent performance with UASP), but they only make sense when USB is the only option. Native SATA still makes more sense for low-performance disks, and native NVMe for high-performance ones.

> Why are there no M.2 SATA NVMe disks? (SATA on the physical layer and NVMe as software protocol.)

Again, because there are better options.

Remember that M.2 slots are not "SATA or NVMe"; they are "SATA or PCIe", allowing any kind of PCIe device to be connected to them… including an AHCI host controller.

So even if this product were aimed at M.2 slots that are PCIe-only (i.e. not having any SATA connections), it would still be much simpler – and cheaper – to include a standard SATA AHCI host controller than a more expensive and less reliable SATA-to-NVMe translator (which internally must contain a SATA host controller anyway).

(And if the M.2 slot already offers SATA lanes, both options are more expensive than directly wiring the same disk to the mainboard's SATA AHCI controller.)

Resizing a Partition

Partitions can't be resized, but they can be deleted and then recreated. When a partition is deleted, the underlying data is still intact. It's not too difficult to delete and recreate a partition, but the calculation must be done exactly right, or the filesystem inside the partition will be corrupted by misalignment or undersizing.

I don't normally prefer using GUIs, but resizing partitions on the command line is prone to human error: you have to account for the partition table type (usually msdos or gpt), the beginning of the partition, the end of the partition, and the right size.

WARNING: Before proceeding, take a backup of your XFS filesystem using this procedure (where /dev/sdg1 is your XFS filesystem and /path/to/backup.xfs is where you want to store your XFS dump):
mount /dev/sdg1 /mnt
xfsdump -f /path/to/backup.xfs -L MySession -M MyMedia /mnt
If something goes wrong, you can restore to a new XFS partition:
mount /dev/sdg1 /mnt # … where /dev/sdg1 is a new XFS partition
xfsrestore -f /path/to/backup.xfs /mnt

Easy Way

GParted does all the calculations for you.

It's very self-explanatory, and it even expands the XFS filesystem to fit.
This is generally a safe procedure.

`fdisk` Way

Use fdisk to delete and recreate the partition. Full example:

root@node53 [~]# fdisk /dev/sdg

Welcome to fdisk (util-linux 2.25.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.


Command (m for help): p
Disk /dev/sdg: 991.5 MiB, 1039663104 bytes, 2030592 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: FAFC7A8C-52CB-4FF2-9746-391D50BF729C

Device     Start     End Sectors  Size Type
/dev/sdg1   2048 1050623 1048576  512M Linux filesystem

Note the "Start" position (sector 2048 in this example). You will need to type this in as the first sector when you recreate the partition.

Command (m for help): d
Selected partition 1
Partition 1 has been deleted.

Command (m for help): n
Partition number (1-128, default 1): 1
First sector (34-2030558, default 2048): 2048
Last sector, +sectors or +size{K,M,G,T,P} (2048-2030558, default 2030558): 2030558

fdisk will default to using the largest contiguous free space. (In this example, that space ends at sector 2030558.)

Created a new partition 1 of type 'Linux filesystem' and of size 990.5 MiB.

Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
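The arithmetic behind the transcript above can be checked with a few lines of shell before you ever type `w`. The helper names (`partition_sectors`, `partition_bytes`, `check_recreate`) are hypothetical, and the checks cover only the two rules this answer relies on: keep the same start sector, and never end earlier than before. This is a sketch under those assumptions, not a full safety net.

```shell
#!/bin/sh
# Hypothetical helpers working on the Start/End sectors printed by fdisk.
# Sector ranges are inclusive, so the count is end - start + 1.
partition_sectors() {   # args: start end
    echo $(( $2 - $1 + 1 ))
}

partition_bytes() {     # args: start end [sector_size, default 512]
    echo $(( ($2 - $1 + 1) * ${3:-512} ))
}

# Refuse a layout that would move the start or cut off the end.
check_recreate() {      # args: old_start old_end new_start new_end
    if [ "$3" -ne "$1" ]; then
        echo "unsafe: start sector changed"
        return 1
    fi
    if [ "$4" -lt "$2" ]; then
        echo "unsafe: new partition ends before the old one"
        return 1
    fi
    echo "ok"
}

partition_sectors 2048 1050623              # old partition, in sectors
partition_bytes 2048 2030558                # new partition, in bytes
check_recreate 2048 1050623 2048 2030558    # same start, larger end
```

With the numbers from this example, the old partition is 1048576 sectors (the 512M fdisk reported) and the new one is 1038597632 bytes, which is the 990.5 MiB fdisk announces.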

Now you have a larger partition which contains a smaller XFS filesystem. These commands would expand the XFS filesystem:

root@node53 [~]# mount -v /dev/sdg1 /mnt
mount: /dev/sdg1 mounted on /mnt.

root@node53 [~]# xfs_growfs /mnt
meta-data=/dev/sdg1              isize=256    agcount=4, agsize=32768 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=131072, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=853, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
data blocks changed from 131072 to 253563
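The block counts in the last line of that output follow from the partition sizes. A minimal sketch, assuming 512-byte sectors and the 4096-byte block size shown in the output; the helper name `fs_blocks_for_partition` is hypothetical, and plain floor division is only an approximation of what a real filesystem does, although it matches both figures here.

```shell
#!/bin/sh
# How many whole filesystem blocks fit in a partition of N sectors?
# fs_blocks_for_partition is a hypothetical helper; integer division
# drops any partial block at the end of the partition.
fs_blocks_for_partition() {   # args: sectors [sector_size] [block_size]
    echo $(( $1 * ${2:-512} / ${3:-4096} ))
}

fs_blocks_for_partition 1048576    # old partition: 2048..1050623
fs_blocks_for_partition 2028511    # new partition: 2048..2030558
```

The two results are 131072 and 253563, the same "from" and "to" values that xfs_growfs reports. The new partition is not a multiple of 4096 bytes, so its last few sectors stay unused.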

Boom, you've got an expanded XFS partition:

root@node53 [~]# df -hT /mnt
Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/sdg1      xfs   988M   26M  962M   3% /mnt

`xfsdump` Way (only way to shrink XFS)

Did you take a backup when I told you to? Yes? Good! I prefer to use xfsrestore to restore xfsdumps onto new partitions. The advantage is that you can actually shrink XFS filesystems using this method, but the downside is that all the data need to be rewritten, which is slower.

You can actually use the fdisk method above to recreate the partition. After exiting fdisk, do this instead:

root@node53 [~]# mkfs.xfs -f /dev/sdg1
meta-data=/dev/sdg1              isize=256    agcount=4, agsize=63391 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=253563, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=853, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
root@node53 [~]# mount -v /dev/sdg1 /mnt
mount: /dev/sdg1 mounted on /mnt.
root@node53 [~]# xfsrestore -f /path/to/backup.xfs /mnt
xfsrestore: using file dump (drive_simple) strategy
xfsrestore: version 3.1.4 (dump format 3.0) - type ^C for status and control
xfsrestore: searching media for dump
xfsrestore: examining media file 0
xfsrestore: dump description: 
xfsrestore: hostname: andie
xfsrestore: mount point: /mnt
xfsrestore: volume: /dev/sdg1
xfsrestore: session time: Mon Nov 16 14:44:20 2015
xfsrestore: level: 0
xfsrestore: session label: "MySession"
xfsrestore: media label: "MyMedia"
xfsrestore: file system id: c5981472-9b75-4fad-9bd8-d1bd04086f8d
xfsrestore: session id: 092b0cf3-120d-43c1-b8ce-23300abf558e
xfsrestore: media id: 3cc0f4db-665f-40fd-ac54-493625f712f5
xfsrestore: using online session inventory
xfsrestore: searching media for directory dump
xfsrestore: reading directories
xfsrestore: 1 directories and 0 entries processed
xfsrestore: directory post-processing
xfsrestore: restore complete: 0 seconds elapsed
xfsrestore: Restore Summary:
xfsrestore:   stream 0 /path/to/backup.xfs OK (success)
xfsrestore: Restore Status: SUCCESS
