Mysql – Autoincrement vs composite primary key for a big table on innodb

clustered-primary-keyindexinnodbMySQLprimary-key

I have a had a quite lengthy discussion with Rick James on this this we came out with idea of having composite key to replace the autoincrement pk where the int is limited close 2 billion. My table will reach this limit in few months easily as monthly we are capturing close to few hundred million data. Below is how my table looks like. The key table is the gdata so I composite the primary using 3 fields PRIMARY KEY (alarmTypeID,vehicleID,gDateTime). Then I have another table called alarm table. The link between both is one to many. Meaning one data in gdata can have zero or more alarms related to it. The link between them is vehicleID and gDateTime.

CREATE TABLE `gdata` ( 
    `alarmTypeID` tinyint(4) NOT NULL DEFAULT '0', 
    `fleetID` smallint(11) NOT NULL, 
    `fleetGroupID` smallint(11) DEFAULT NULL, 
    `fleetSubGroupID` smallint(11) DEFAULT NULL, 
    `deviceID` mediumint(11) NOT NULL, 
    `vehicleID` mediumint(11) NOT NULL, 
    `gDateTime` datetime NOT NULL, 
    `insertDateTime` datetime NOT NULL, 
    `latitude` float NOT NULL, 
    `longitude` float NOT NULL, 
    `speed` smallint(11) NOT NULL
     -- (see full text) 
) ;
ALTER TABLE `gdata` 
    ADD PRIMARY KEY (`alarmTypeID`,`vehicleID`,`gDateTime`), 
    ADD KEY `gDateTime` (`gDateTime`), 
    ADD KEY `fleetID` (`fleetID`,`vehicleID`,`gDateTime`); 
COMMIT; 

Here is the alarm table

CREATE TABLE `alarm` (
    `alarmTypeID` tinyint(4) NOT NULL, 
    `vehicleID` mediumint(9) NOT NULL, 
    `gDateTime` datetime NOT NULL, 
    `insertDateTime` datetime NOT NULL, 
    `alarmValue` varchar(5) NOT NULL, 
    `readWeb` enum('n','y') NOT NULL DEFAULT 'n', 
    `readWebDateTime` datetime NOT NULL, 
    `readMobile` enum('n','y') NOT NULL DEFAULT 'n', 
    `readMobileDateTim` datetime NOT NULL 
) ENGINE=InnoDB DEFAULT CHARSET=latin1; 

ALTER TABLE `alarm` 
    ADD PRIMARY KEY (`alarmTypeID`,`vehicleID`,`gDateTime`); 
COMMIT;

All looks good but recently I was doing some googling on related topic and found that some discussion https://www.quora.com/Is-it-a-bad-idea-to-have-a-primary-key-on-3-or-more-columns are against composite primary key and would prefer to go with autoincrement mainly for insert purpose. Can some one shed more light on this to maintain composite key of primary or move back to autoincrement?

Best Answer

There is nothing wrong with a composite key. However, you have to take into account how InnoDB stores data.

Quoting the above linked documentation:

The data in each InnoDB table is divided into pages. The pages that make up each table are arranged in a tree data structure called a B-tree index. Table data and secondary indexes both use this type of structure. The B-tree index that represents an entire table is known as the clustered index, which is organized according to the primary key columns. The nodes of the index data structure contain the values of all the columns in that row (for the clustered index) or the index columns and the primary key columns (for secondary indexes).

That is, InnoDB will store the data according to your PRIMARY KEY. If the data you're inserting has an increasing PK, page fragmentation does not occur. That will happen always with an AUTO_INCREMENT. If you're inserting the data in chronological order (i.e. gDateTime is always monotonically increasing), changing the order of the columns that make out your PK to:

PRIMARY KEY (`gDateTime`, `alarmTypeID`, `vehicleID`)

... will have the same advantages, with regard to not having to "fit a new row in the middle of others" (which means, the B-tree isn't fragmented for every insert).

However: if you reference this table from other (related) tables, you have to store, in the referencing table always the PK (gDateTime, alarmTypeID, vehicleID). This means you're saving every time 7 or 8 bytes of storage. The composite PK would use 2 + 1 + 8 = 11 bytes of information (probably it uses 12 bytes due to alignment); whereas an INT UNSIGNED AUTO_INCREMENT, you'll use only 4 bytes in the referencing table. You're limited to a 2^32 different values for your PK. If you need more than 2^32 values, you'll need BIGINT AUTO_INCREMENT, that gives you 2^64 (and I haven't found yet a practical case where this isn't big enough).

Whether this makes sense or not, depends a lot on your particular scenario.