MySQL Cluster – Resolving Slow Three-Table JOIN Query

MySQLmysql-clusterndbclusterperformancequery-performance

I have a select query called by a script to our MySQL Cluster that has gone from running in a second or two to taking well over a minute per query (and it has to run this query 24000 times). Nothing has changed code-wise in the few months since it was last run. I'm not a proper DBA, just a long-time MySQL user, and I read on another thread that it could be an issue with the indexes outgrowing the limit of the key_cache size? But would this apply to ndbcluster table types? And if so, how would I tell?

The output of Explain for the problem query is:

EXPLAIN SELECT userid from users left outer join routes on routes.user_id = users.id left outer join aliases on aliases.route_id = routes.id where aliases.name='xxxx' and domain_id=8;

+----+-------------+---------+------+-----------------+----------+---------+------------------+--------+-----------------------------------+ 
| id | select_type | table   | type | possible_keys   | key      | key_len | ref              | rows   | Extra                            |
+----+-------------+---------+------+-----------------+----------+---------+------------------+--------+-----------------------------------+ 
|  1 | SIMPLE      | users   | ALL  | PRIMARY         | NULL     | NULL    | NULL             | 177802 |                                  | 
|  1 | SIMPLE      | routes  | ref  | PRIMARY,user_id | user_id  | 4       | vnames.users.id  |      1 | Using where                       | 
|  1 | SIMPLE      | aliases | ref  | route_id        | route_id | 4       | vnames.routes.id |      1 | Using where with pushed condition |
+----+-------------+---------+------+-----------------+----------+---------+------------------+--------+-----------------------------------+

And the output of Show Indexes for the three tables is:

+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table   | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| aliases |          0 | PRIMARY  |            1 | id          | A         |      377197 |     NULL | NULL   |      | BTREE      |         |
| aliases |          1 | route_id |            1 | route_id    | A         |      377197 |     NULL | NULL   |      | BTREE      |         |
+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table  | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| routes |          0 | PRIMARY  |            1 | id          | A         |      208442 |     NULL | NULL   |      | BTREE      |         |
| routes |          1 | user_id  |            1 | user_id     | A         |      208442 |     NULL | NULL   |      | BTREE      |         |
| routes |          1 | address  |            1 | address     | A         |      208442 |     NULL | NULL   |      | BTREE      |         |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| users |          0 | PRIMARY  |            1 | id          | A         |      177803 |     NULL | NULL   |      | BTREE      |         |
| users |          1 | userid   |            1 | userid      | A         |      177803 |     NULL | NULL   |      | BTREE      |         |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

Any help/suggestions would be greatly appreciated.

Best Answer

I suspect this query has died due to the volume in the tables. Running a query that returns all userids 24000 times will eventually cause performance issues.

At some point you your tables grow to the point they don't fit in memory (very fast access) and you end up reading data from disk (very slow queries for most queries. This is a sudden switch. One query I wrote went from sub-second to five minutes when this happened. If you can compact the tables, this may resolve the condition for a while. Adding more memory may help. (Even if your database doesn't have enough buffer space, the O/S may buffer the data for you.)

This is how I formatted the query to understand it.

SELECT userid 
FROM   users 
LEFT OUTER JOIN routes 
ON     routes.user_id = users.id 
LEFT OUTER JOIN aliases 
ON     aliases.route_id = routes.id 
WHERE  aliases.name = 'xxxx' 
AND    domain_id = 8;

If you are searching for aliases that match the aliases.name, it would help to index that column.

Based on the LEFT OUTER JOINS, you should get the userid for all users or all users with domain_id = 8. As I read this query it is a long way of writing:

SELECT userid 
FROM   users;

Or

SELECT userid 
from   users
WHERE  domain_id = 8;

However I expect you want the user id when the alias and domain match.

SELECT userid 
FROM   users 
JOIN   routes 
ON     routes.user_id = users.id 
JOIN   aliases 
ON     aliases.route_id = routes.id 
WHERE  aliases.name = 'xxxx' 
AND    domain_id = 8;

If you are looking for the userid matching the aliases.name value, it would help to index aliases.name.