I'm maintaining a database that's used by a bespoke software application that has some performance issues. On of the worst offenders is query (3) below, and I'd like to, if possible, optimize the execution plan from the database side, since asking the developers to change anything tends to be costly and take quite a while.
(1) This query is fast, using indexes:
explain
select game.id from games.game where home = 573368 and active = 1
union
select game.id from games.game where guest=573368 and active = 1
union
select game.id from games.game, games.gamegroup where gamegroup.game=game.id and gamegroup.team=573368 and active=1;
+------+--------------+--------------+--------+----------------+---------+---------+----------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------+--------------+--------+----------------+---------+---------+----------------------+------+-------------+
| 1 | PRIMARY | game | ref | active,home | home | 5 | const | 10 | Using where |
| 2 | UNION | game | ref | active,guest | guest | 5 | const | 10 | Using where |
| 3 | UNION | gamegroup | ref | game,team | team | 5 | const | 1 | Using where |
| 3 | UNION | game | eq_ref | PRIMARY,active | PRIMARY | 4 | games.gamegroup.game | 1 | Using where |
| NULL | UNION RESULT | <union1,2,3> | ALL | NULL | NULL | NULL | NULL | NULL | |
+------+--------------+--------------+--------+----------------+---------+---------+----------------------+------+-------------+
(2) Taking the results of this into a second query is fast as well:
explain
select * from games.game where id in (3661383, 3661488, 3661543, 3661598, 3661668, 3661718, 3661743, 3661808, 3661868, 3661893, 3661393, 3661443, 3661483, 3661483, 3661533, 3661578, 3661623, 3661658, 3661763, 3661818, 3661878);
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | game | range | PRIMARY | PRIMARY | 4 | NULL | 20 | Using where |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
(3) However, plugging the first query into the second as a subselect results in a full table scan (FTS) and is accordingly slow:
explain
select * from games.game where id in (
select game.id from games.game where home = 573368 and active = 1
union
select game.id from games.game where guest=573368 and active = 1
union
select game.id from games.game, games.gamegroup where gamegroup.game=game.id and gamegroup.team=573368 and active=1
);
+------+--------------------+--------------+--------+----------------------+---------+---------+-------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------------+--------------+--------+----------------------+---------+---------+-------+--------+-------------+
| 1 | PRIMARY | game | ALL | NULL | NULL | NULL | NULL | 917603 | Using where |
| 2 | DEPENDENT SUBQUERY | game | eq_ref | PRIMARY,active,home | PRIMARY | 4 | func | 1 | Using where |
| 3 | DEPENDENT UNION | game | eq_ref | PRIMARY,active,guest | PRIMARY | 4 | func | 1 | Using where |
| 4 | DEPENDENT UNION | gamegroup | ref | game,team | team | 5 | const | 1 | |
| 4 | DEPENDENT UNION | game | eq_ref | PRIMARY | PRIMARY | 4 | func | 1 | Using where |
| NULL | UNION RESULT | <union2,3,4> | ALL | NULL | NULL | NULL | NULL | NULL | |
+------+--------------------+--------------+--------+----------------------+---------+---------+-------+--------+-------------+
(4) Nesting another subselect which does effectively nothing except giving the thing a name avoids the FTS, making the query fast again:
explain
select * from games.game where id in (
select * from (
select game.id from games.game where home = 573368 and active = 1
union
select game.id from games.game where guest=573368 and active = 1
union
select game.id from games.game, games.gamegroup where gamegroup.game=game.id and gamegroup.team=573368 and active=1
) as uni
);
+------+--------------+--------------+--------+----------------+---------+---------+----------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------+--------------+--------+----------------+---------+---------+----------------------+------+-------------+
| 1 | PRIMARY | <subquery2> | ALL | distinct_key | NULL | NULL | NULL | 21 | |
| 1 | PRIMARY | game | eq_ref | PRIMARY | PRIMARY | 4 | uni.id | 1 | |
| 2 | MATERIALIZED | <derived3> | ALL | NULL | NULL | NULL | NULL | 21 | |
| 3 | DERIVED | game | ref | active,home | home | 5 | const | 10 | Using where |
| 4 | UNION | game | ref | active,guest | guest | 5 | const | 10 | Using where |
| 5 | UNION | gamegroup | ref | game,team | team | 5 | const | 1 | Using where |
| 5 | UNION | game | eq_ref | PRIMARY,active | PRIMARY | 4 | games.gamegroup.game | 1 | Using where |
| NULL | UNION RESULT | <union3,4,5> | ALL | NULL | NULL | NULL | NULL | NULL | |
+------+--------------+--------------+--------+----------------+---------+---------+----------------------+------+-------------+
Unfortunately, the application, which isn't easy to get changed, uses query (3).
Are there any steps that can be taken from the DBA side to nudge the execution plan of (3) into the direction of (4)?
I did read https://dev.mysql.com/doc/refman/5.7/en/optimizer-hints.html, which needs the application to pass hints through, and https://dev.mysql.com/doc/refman/5.7/en/switchable-optimizations.html, which seems to affect all queries on a global basis. However, I'd like to give the DB a hint just for this query. Is there any way, in mysql/mariadb, to do that? Or is there another workaround?
As requested in a comment, here are the table definitions:
CREATE TABLE `game` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`no` int(11) NOT NULL DEFAULT 0,
`class` int(11) NOT NULL DEFAULT 0,
`dt` datetime DEFAULT NULL,
`gdate` char(8) NOT NULL DEFAULT '',
`sgdate` int(11) DEFAULT NULL,
`gtime` char(5) DEFAULT NULL,
`gymnasium` int(11) DEFAULT NULL,
`ogymnasium` int(11) DEFAULT NULL,
`resp` int(11) DEFAULT NULL,
`oresp` int(11) DEFAULT NULL,
`home` int(11) DEFAULT NULL,
`guest` int(11) DEFAULT NULL,
`hgoals` int(11) DEFAULT NULL,
`ggoals` int(11) DEFAULT NULL,
`hgoals1` int(11) DEFAULT NULL,
`ggoals1` int(11) DEFAULT NULL,
`hpoints` int(11) DEFAULT NULL,
`gpoints` int(11) DEFAULT NULL,
`admin` int(11) NOT NULL DEFAULT 0,
`period` int(11) NOT NULL DEFAULT 0,
`comment` char(30) DEFAULT NULL,
`mdt` datetime DEFAULT NULL,
`user` int(11) DEFAULT NULL,
`disposer` int(11) DEFAULT NULL,
`disposer_org` int(11) DEFAULT NULL,
`sms` int(11) DEFAULT NULL,
`sms2check` smallint(6) DEFAULT NULL,
`active` int(11) DEFAULT NULL,
`req` int(11) DEFAULT NULL,
`reqtype` int(11) DEFAULT NULL,
`groupsort` int(11) DEFAULT NULL,
`round` int(11) DEFAULT 0,
`editstatus` int(11) DEFAULT 0,
`ti_err` smallint(6) DEFAULT NULL,
`refereeinfo_a` int(11) DEFAULT NULL,
`refereeinfo_b` int(11) DEFAULT NULL,
`ref_a_adrnr` int(11) DEFAULT NULL,
`ref_b_adrnr` int(11) DEFAULT NULL,
`ref_org` int(11) DEFAULT 0,
`ref_costs` float DEFAULT NULL,
`ref_sealed` int(11) DEFAULT NULL,
`sealed` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `gdate` (`gdate`),
KEY `class` (`class`),
KEY `period` (`period`),
KEY `sgdate` (`sgdate`),
KEY `active` (`active`),
KEY `dt` (`dt`),
KEY `admin` (`admin`),
KEY `sms2check` (`sms2check`),
KEY `gymnasium` (`gymnasium`),
KEY `home` (`home`),
KEY `guest` (`guest`),
KEY `groupsort` (`groupsort`),
KEY `resp` (`resp`)
) ENGINE=InnoDB AUTO_INCREMENT=3796577 DEFAULT CHARSET=latin1
CREATE TABLE `gamegroup` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`game` int(11) DEFAULT NULL,
`team` int(11) DEFAULT NULL,
`sort` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `game` (`game`),
KEY `team` (`team`)
) ENGINE=InnoDB AUTO_INCREMENT=46302 DEFAULT CHARSET=latin1
Best Answer
Add composite keys:
Shrink the data:
INT
always takes 4 bytes.TINYINT UNSIGNED
takes 1 bytes and is probably more than adequate forperiod
,active
etc.If
gamegroup
is a many-to-many mapping between teams and games, you probably don't needid
. Instead:Formulation 5:
This effectively does '1', then '2'.
UNION ALL
is usually faster thanUNION DISTINCT
(which is whatUNION
means). But you can get dups.IN ( SELECT ... )
should be avoided -- there are many cases (especially in older MySQL versions) where it performs terribly.More tips: http://mysql.rjweb.org/doc.php/index_cookbook_mysql
Even better Provide a Stored Procedure that
573368
as an argument and delivers the result set using whatever is deemed the optimal formulation. They will prefer doingCALL
instead of typing all that lengthy query. That may get them to change their code.