Mysql – Optimizing mariabb/thesql queries as a DBA

mariadbMySQLoptimizationperformance

I'm maintaining a database that's used by a bespoke software application that has some performance issues. On of the worst offenders is query (3) below, and I'd like to, if possible, optimize the execution plan from the database side, since asking the developers to change anything tends to be costly and take quite a while.

(1) This query is fast, using indexes:

explain
select game.id from games.game where home = 573368 and active = 1
union
select game.id from games.game where guest=573368 and active = 1
union
select game.id from games.game, games.gamegroup where gamegroup.game=game.id and gamegroup.team=573368 and active=1;
+------+--------------+--------------+--------+----------------+---------+---------+----------------------+------+-------------+
| id   | select_type  | table        | type   | possible_keys  | key     | key_len | ref                  | rows | Extra       |
+------+--------------+--------------+--------+----------------+---------+---------+----------------------+------+-------------+
|    1 | PRIMARY      | game         | ref    | active,home    | home    | 5       | const                |   10 | Using where |
|    2 | UNION        | game         | ref    | active,guest   | guest   | 5       | const                |   10 | Using where |
|    3 | UNION        | gamegroup    | ref    | game,team      | team    | 5       | const                |    1 | Using where |
|    3 | UNION        | game         | eq_ref | PRIMARY,active | PRIMARY | 4       | games.gamegroup.game |    1 | Using where |
| NULL | UNION RESULT | <union1,2,3> | ALL    | NULL           | NULL    | NULL    | NULL                 | NULL |             |
+------+--------------+--------------+--------+----------------+---------+---------+----------------------+------+-------------+ 

(2) Taking the results of this into a second query is fast as well:

explain
select * from games.game where id in (3661383, 3661488, 3661543, 3661598, 3661668, 3661718, 3661743, 3661808, 3661868, 3661893, 3661393, 3661443, 3661483, 3661483, 3661533, 3661578, 3661623, 3661658, 3661763, 3661818, 3661878);
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id   | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
|    1 | SIMPLE      | game  | range | PRIMARY       | PRIMARY | 4       | NULL |   20 | Using where |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+

(3) However, plugging the first query into the second as a subselect results in a full table scan (FTS) and is accordingly slow:

explain
select * from games.game where id in (
    select game.id from games.game where home = 573368 and active = 1
    union
    select game.id from games.game where guest=573368 and active = 1
    union
    select game.id from games.game, games.gamegroup where gamegroup.game=game.id and gamegroup.team=573368 and active=1
);
+------+--------------------+--------------+--------+----------------------+---------+---------+-------+--------+-------------+
| id   | select_type        | table        | type   | possible_keys        | key     | key_len | ref   | rows   | Extra       |
+------+--------------------+--------------+--------+----------------------+---------+---------+-------+--------+-------------+
|    1 | PRIMARY            | game         | ALL    | NULL                 | NULL    | NULL    | NULL  | 917603 | Using where |
|    2 | DEPENDENT SUBQUERY | game         | eq_ref | PRIMARY,active,home  | PRIMARY | 4       | func  |      1 | Using where |
|    3 | DEPENDENT UNION    | game         | eq_ref | PRIMARY,active,guest | PRIMARY | 4       | func  |      1 | Using where |
|    4 | DEPENDENT UNION    | gamegroup    | ref    | game,team            | team    | 5       | const |      1 |             |
|    4 | DEPENDENT UNION    | game         | eq_ref | PRIMARY              | PRIMARY | 4       | func  |      1 | Using where |
| NULL | UNION RESULT       | <union2,3,4> | ALL    | NULL                 | NULL    | NULL    | NULL  |   NULL |             |
+------+--------------------+--------------+--------+----------------------+---------+---------+-------+--------+-------------+

(4) Nesting another subselect which does effectively nothing except giving the thing a name avoids the FTS, making the query fast again:

explain
select * from games.game where id in (
    select  * from (
        select game.id from games.game where home = 573368 and active = 1
        union
        select game.id from games.game where guest=573368 and active = 1
        union
        select game.id from games.game, games.gamegroup where gamegroup.game=game.id and gamegroup.team=573368 and active=1
    ) as uni
);
+------+--------------+--------------+--------+----------------+---------+---------+----------------------+------+-------------+
| id   | select_type  | table        | type   | possible_keys  | key     | key_len | ref                  | rows | Extra       |
+------+--------------+--------------+--------+----------------+---------+---------+----------------------+------+-------------+
|    1 | PRIMARY      | <subquery2>  | ALL    | distinct_key   | NULL    | NULL    | NULL                 |   21 |             |
|    1 | PRIMARY      | game         | eq_ref | PRIMARY        | PRIMARY | 4       | uni.id               |    1 |             |
|    2 | MATERIALIZED | <derived3>   | ALL    | NULL           | NULL    | NULL    | NULL                 |   21 |             |
|    3 | DERIVED      | game         | ref    | active,home    | home    | 5       | const                |   10 | Using where |
|    4 | UNION        | game         | ref    | active,guest   | guest   | 5       | const                |   10 | Using where |
|    5 | UNION        | gamegroup    | ref    | game,team      | team    | 5       | const                |    1 | Using where |
|    5 | UNION        | game         | eq_ref | PRIMARY,active | PRIMARY | 4       | games.gamegroup.game |    1 | Using where |
| NULL | UNION RESULT | <union3,4,5> | ALL    | NULL           | NULL    | NULL    | NULL                 | NULL |             |
+------+--------------+--------------+--------+----------------+---------+---------+----------------------+------+-------------+

Unfortunately, the application, which isn't easy to get changed, uses query (3).

Are there any steps that can be taken from the DBA side to nudge the execution plan of (3) into the direction of (4)?

I did read https://dev.mysql.com/doc/refman/5.7/en/optimizer-hints.html, which needs the application to pass hints through, and https://dev.mysql.com/doc/refman/5.7/en/switchable-optimizations.html, which seems to affect all queries on a global basis. However, I'd like to give the DB a hint just for this query. Is there any way, in mysql/mariadb, to do that? Or is there another workaround?


As requested in a comment, here are the table definitions:

CREATE TABLE `game` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `no` int(11) NOT NULL DEFAULT 0,
  `class` int(11) NOT NULL DEFAULT 0,
  `dt` datetime DEFAULT NULL,
  `gdate` char(8) NOT NULL DEFAULT '',
  `sgdate` int(11) DEFAULT NULL,
  `gtime` char(5) DEFAULT NULL,
  `gymnasium` int(11) DEFAULT NULL,
  `ogymnasium` int(11) DEFAULT NULL,
  `resp` int(11) DEFAULT NULL,
  `oresp` int(11) DEFAULT NULL,
  `home` int(11) DEFAULT NULL,
  `guest` int(11) DEFAULT NULL,
  `hgoals` int(11) DEFAULT NULL,
  `ggoals` int(11) DEFAULT NULL,
  `hgoals1` int(11) DEFAULT NULL,
  `ggoals1` int(11) DEFAULT NULL,
  `hpoints` int(11) DEFAULT NULL,
  `gpoints` int(11) DEFAULT NULL,
  `admin` int(11) NOT NULL DEFAULT 0,
  `period` int(11) NOT NULL DEFAULT 0,
  `comment` char(30) DEFAULT NULL,
  `mdt` datetime DEFAULT NULL,
  `user` int(11) DEFAULT NULL,
  `disposer` int(11) DEFAULT NULL,
  `disposer_org` int(11) DEFAULT NULL,
  `sms` int(11) DEFAULT NULL,
  `sms2check` smallint(6) DEFAULT NULL,
  `active` int(11) DEFAULT NULL,
  `req` int(11) DEFAULT NULL,
  `reqtype` int(11) DEFAULT NULL,
  `groupsort` int(11) DEFAULT NULL,
  `round` int(11) DEFAULT 0,
  `editstatus` int(11) DEFAULT 0,
  `ti_err` smallint(6) DEFAULT NULL,
  `refereeinfo_a` int(11) DEFAULT NULL,
  `refereeinfo_b` int(11) DEFAULT NULL,
  `ref_a_adrnr` int(11) DEFAULT NULL,
  `ref_b_adrnr` int(11) DEFAULT NULL,
  `ref_org` int(11) DEFAULT 0,
  `ref_costs` float DEFAULT NULL,
  `ref_sealed` int(11) DEFAULT NULL,
  `sealed` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `gdate` (`gdate`),
  KEY `class` (`class`),
  KEY `period` (`period`),
  KEY `sgdate` (`sgdate`),
  KEY `active` (`active`),
  KEY `dt` (`dt`),
  KEY `admin` (`admin`),
  KEY `sms2check` (`sms2check`),
  KEY `gymnasium` (`gymnasium`),
  KEY `home` (`home`),
  KEY `guest` (`guest`),
  KEY `groupsort` (`groupsort`),
  KEY `resp` (`resp`)
) ENGINE=InnoDB AUTO_INCREMENT=3796577 DEFAULT CHARSET=latin1



CREATE TABLE `gamegroup` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `game` int(11) DEFAULT NULL,
  `team` int(11) DEFAULT NULL,
  `sort` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `game` (`game`),
  KEY `team` (`team`)
) ENGINE=InnoDB AUTO_INCREMENT=46302 DEFAULT CHARSET=latin1

Best Answer

Add composite keys:

games:  INDEX(home, active, id)
games:  INDEX(guest, active, id)
gamegroup:  INDEX(team, game)

Shrink the data:

INT always takes 4 bytes. TINYINT UNSIGNED takes 1 bytes and is probably more than adequate for period, active etc.

If gamegroup is a many-to-many mapping between teams and games, you probably don't need id. Instead:

CREATE TABLE `gamegroup` (
  `game` int(11) DEFAULT NULL,
  `team` int(11) DEFAULT NULL,
  `sort` int(11) DEFAULT NULL,
  PRIMARY KEY (team, game),
  KEY         (game, team)
) ENGINE=InnoDB AUTO_INCREMENT=46302 DEFAULT CHARSET=latin1

Formulation 5:

select g.*
    FROM (
        select game.id from games.game where home = 573368 and active = 1
        union
        select game.id from games.game where guest=573368 and active = 1
        union
        select game.id
            FROM games.game
            JOIN games.gamegroup
            where gamegroup.game = game.id
              and gamegroup.team=573368
              and active=1
         ) as uni
    JOIN game AS g  USING(id)
);

This effectively does '1', then '2'.

UNION ALL is usually faster than UNION DISTINCT (which is what UNION means). But you can get dups.

IN ( SELECT ... ) should be avoided -- there are many cases (especially in older MySQL versions) where it performs terribly.

More tips: http://mysql.rjweb.org/doc.php/index_cookbook_mysql

Even better Provide a Stored Procedure that 573368 as an argument and delivers the result set using whatever is deemed the optimal formulation. They will prefer doing CALL instead of typing all that lengthy query. That may get them to change their code.