I'm running a MySQL 5.1 server and I have a MyISAM table called my_table
with nearly 5 million records. Its structure is the following:
+-----------------------------------------------------------------------+
|Field |Type |Null|Key|Default |Extra |
+-----------------------------------------------------------------------+
|id |int(11) |NO |PRI| |auto_increment|
+-----------------------------------------------------------------------+
|article_id |int(11) |NO |MUL|0 | |
+-----------------------------------------------------------------------+
|user_id |int(11) |NO |MUL|0 | |
+-----------------------------------------------------------------------+
|timestamp |datetime |YES |MUL|0000-00-00 00:00:00| |
+-----------------------------------------------------------------------+
|txt |text |YES | | | |
+-----------------------------------------------------------------------+
|soukrome |tinyint(4)|NO |MUL|0 | |
+-----------------------------------------------------------------------+
|vazna_kritika |tinyint(4)|NO | |0 | |
+-----------------------------------------------------------------------+
|kos |tinyint(4)|NO |MUL|0 | |
+-----------------------------------------------------------------------+
|done |tinyint(4)|NO |MUL|0 | |
+-----------------------------------------------------------------------+
|autor_id |int(11) |NO |MUL|0 | |
+-----------------------------------------------------------------------+
|sbirka_id |int(11) |NO |MUL|0 | |
+-----------------------------------------------------------------------+
|klub_id |int(11) |NO |MUL|0 | |
+-----------------------------------------------------------------------+
|klub_id_procleny|int(11) |NO |MUL|0 | |
+-----------------------------------------------------------------------+
|notinlist |tinyint(1)|NO |MUL|0 | |
+-----------------------------------------------------------------------+
|admin_id |int(11) |NO | |0 | |
+-----------------------------------------------------------------------+
|kritika_redakce |tinyint(1)|NO | |0 | |
+-----------------------------------------------------------------------+
Now, when I run this simple query that should return the time of the latest comment, the query hangs for several minutes:
SELECT timestamp FROM my_table WHERE article_id = 123456 ORDER BY timestamp DESC LIMIT 1
The strange thing about this is that for most article_id
s the query finishes almost instantly. It only hangs for some article_id
s.
EXPLAIN
ing the query doesn't help. The execution plan seems pretty standard:
+----------------------------------------------------------------------------------------------------+
|id|select_type|table |type|possible_keys|key |key_len|ref |rows|Extra |
+----------------------------------------------------------------------------------------------------+
|1 |SIMPLE |my_table|ref |article_id |article_id|4 |const|1 |Using where; Using filesort|
+----------------------------------------------------------------------------------------------------+
Now, the weirdest thing is, that if I rewrite the query like this, everything starts to behave correctly:
SELECT * FROM ( SELECT timestamp FROM my_table WHERE article_id = 123456 ) _my_table ORDER BY timestamp DESC LIMIT 1
I'm just really curious if anyone has any idea why such an issue might be happening?
EDIT: Adding SHOW CREATE TABLE and SHOW FULL PROCESSLIST.
CREATE TABLE `my_table` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`article_id` int(11) NOT NULL DEFAULT '0',
`user_id` int(11) NOT NULL DEFAULT '0',
`timestamp` datetime DEFAULT '0000-00-00 00:00:00',
`txt` text,
`soukrome` tinyint(4) NOT NULL DEFAULT '0',
`vazna_kritika` tinyint(4) NOT NULL DEFAULT '0',
`kos` tinyint(4) NOT NULL DEFAULT '0',
`done` tinyint(4) NOT NULL DEFAULT '0',
`autor_id` int(11) NOT NULL DEFAULT '0',
`sbirka_id` int(11) NOT NULL DEFAULT '0',
`klub_id` int(11) NOT NULL DEFAULT '0',
`klub_id_procleny` int(11) NOT NULL DEFAULT '0',
`notinlist` tinyint(1) NOT NULL DEFAULT '0',
`admin_id` int(11) NOT NULL DEFAULT '0',
`kritika_redakce` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `timestamp` (`timestamp`),
KEY `user_id` (`user_id`),
KEY `autor_id` (`autor_id`),
KEY `sbirka_id` (`sbirka_id`),
KEY `klub_id` (`klub_id`),
KEY `article_id` (`article_id`),
KEY `no` (`kos`),
KEY `done` (`done`),
KEY `soukrome` (`soukrome`),
KEY `klub_id_procleny` (`klub_id_procleny`),
KEY `notinlist` (`notinlist`)
) ENGINE=MyISAM AUTO_INCREMENT=4470321 DEFAULT CHARSET=utf8
If I remember it correctly, the processlist showed something along these lines (I'm unable to reproduce the issue on the fixed query and I don't remember the problematic article_id
s):
Id User Host db Command Time State Info
74814 root localhost p2 Query 62 Sending data SELECT timestamp FROM...
74815 root localhost p2 Query 62 Sleep
Best Answer
As your table is structured now, there are only two "good" options for satisfying your query, neither one ideal, and one of them isn't likely to be very good at all:
The thing is, the execution plan can actually change for a given query depending on the specific argument in the
WHERE
clause. It could be that you're getting a different plan for particular values -- the question seems to imply that yourEXPLAIN
output may not have been from one of the problematic rows, so this may not be the plan chosen for the other article_id values.If I'm correct that you haven't captured the query plan for one of the problematic values, it could be that the subquery makes your intentions somehow more obvious and changes the plan.
Really, though, if this is a common query, you need an index on both columns together.
This index should always be the obvious choice from the optimizer's perspective, since it knows exactly where to find the largest timestamp for any given article, so this index should eliminate "Using where" and "Using filesort" and result in all such queries being faster, though the difference may be less significant for some queries that are performing acceptably now.