Mysql – How to Speed Up the SQL Query

greatest-n-per-groupMySQLoptimizationperformancequery-performance

I have a query that is taking much longer than I expect it should, and I can not figure out how to speed it up.

For a little bit of background on the table structure, these are audited records. There's a parent record, which has many child records. If a parent record is modified, an audit is written for it (but not for the children). If a child record is modified, an audit record is written for both the parent and the child (but not the other children).

Both of these tables have an integer ID and a revision number. The child table also has a column for the parent's ID. The way these are accessed generally is for the parent record to be selected, then the children records with the most recent revisions to the parent's revision number are brought in.

For example, if parent A has a revision of 100, and child B has records with revisions of 101, 98, and 90. Revision 98 will be brought in for child B.

Now, this is usually done on a per-parent basis, and the query is pretty quick. I'm trying to make a similar query that will give me all of the relevant IDs and revision numbers for each child. This is my query –

SELECT c.id AS cId, MAX(c.REV) AS cRev FROM Parent_AUD p 
INNER JOIN Child_AUD c ON p.id = c.parent_id AND p.REV >= c.REV
GROUP BY p.id, p.REV, c.id

This takes over 20 minutes to run (I've never actually let it finish, so I'm not sure how long exactly it takes). There's about 2 millions parent records and 13 million child records, which I don't think warrants that much time.

This is the output of EXPLAIN –

+------+-------------+--------+-------+----------------------------------------------------+-----------------------------+---------+----------------------+---------+----------------------------------------------+
| id   | select_type | table  | type  | possible_keys                                      | key                         | key_len | ref                  | rows    | Extra                                        |
+------+-------------+--------+-------+----------------------------------------------------+-----------------------------+---------+----------------------+---------+----------------------------------------------+
|    1 | SIMPLE      | parent | index | PRIMARY,FK_PARENT_REV                              | FK_PARENT_REV               | 4       | NULL                 | 2150722 | Using index; Using temporary; Using filesort |
|    1 | SIMPLE      | child  | ref   | FK_CHILD_REV,parent_id_index                       | parent_id_index             | 9       | db.p.id              |      68 | Using where; Using index                     |
+------+-------------+--------+-------+----------------------------------------------------+-----------------------------+---------+----------------------+---------+----------------------------------------------+

I have single-column indexes on the parent ID, the child ID, the parent revision number, the child revision number, and the parent ID within the child table.

Does anybody have any ideas on how I can speed this up?

EDIT: This is the output of SHOW CREATE TABLE for both tables (with unrelevant column names changed) –

Parent_AUD | CREATE TABLE `Parent_AUD` (
  `id` bigint(20) NOT NULL,
  `REV` int(11) NOT NULL,
  `REVTYPE` tinyint(4) DEFAULT NULL,
  `dateModified` datetime(6) DEFAULT NULL,
  `colOne` longblob DEFAULT NULL,
  `colTwo` int(11) DEFAULT NULL,
  `colThree` bigint(20) DEFAULT NULL,
  `colFour` bigint(20) DEFAULT NULL,
  `colFive` varchar(255) DEFAULT NULL,
  `colSix` tinyint(1) DEFAULT NULL,
  `colSeven` varchar(255) DEFAULT NULL,
  `colEight` bigint(20) DEFAULT NULL,
  `colNine` tinyint(1) DEFAULT NULL,
  `colTen` bigint(20) DEFAULT NULL,
  PRIMARY KEY (`id`,`REV`),
  KEY `FK_PARENT_REV` (`REV`),
  CONSTRAINT `FK_PARENT_REV` FOREIGN KEY (`REV`) REFERENCES `REVINFO` (`REV`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Child_AUD | CREATE TABLE `Child_AUD` (
  `id` bigint(20) NOT NULL,
  `REV` int(11) NOT NULL,
  `REVTYPE` tinyint(4) DEFAULT NULL,
  `colOne` longblob DEFAULT NULL,
  `colTwo` longblob DEFAULT NULL,
  `colThree` bit(1) DEFAULT NULL,
  `colFour` int(11) DEFAULT NULL,
  `colFive` longblob DEFAULT NULL,
  `parent_id` bigint(20) DEFAULT NULL,
  `colSix` varchar(255) DEFAULT NULL,
  `colSeven` tinyint(1) DEFAULT NULL,
  `colEight` longblob DEFAULT NULL,
  `colNine` tinyint(1) DEFAULT NULL,
  `colTen` tinyint(1) DEFAULT NULL,
  `dateModified` datetime(6) DEFAULT NULL,
  PRIMARY KEY (`id`,`REV`),
  KEY `FK_CHILD_REV` (`REV`),
  KEY `parent_id_index` (`parent_id`),
  CONSTRAINT `FK_CHILD_REV` FOREIGN KEY (`REV`) REFERENCES `REVINFO` (`REV`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Best Answer

KEY `parent_id_index` (`parent_id`),

-->

KEY (parent_id, REV, id)

The order is critical. First it filters on parent_id =, then no a range of REV, and finally has id to make the index "covering".