I have a 10 million record table in a MySQL 5.5 InnoDB. It's running in a 16GB RAM server with good CPU and fast HD. When I run the following query
SELECT DISTINCT knowledge_id,
tag_order,
tag_weight,
total_title_direct_words,
total_title_parenthesis_words,
tag_level
FROM knowledge_tags
WHERE ( tag_text = 'washington' )
AND ( tag_level < 10 );
It will take around 9 seconds (first run) locally (localhost). The result count is 640. There is a Btree index for the fields 'tag_text, tag_level'. This query time isn't acceptable for what i want. What can I do?
Here is the EXPLAIN result:
1 SIMPLE knowledge_tags range tags_by_word_text tags_by_word_text 308 [null] 344 Using index condition; Using temporary
And the CREATE TABLE statement:
CREATE TABLE `knowledge_tags` (
`tag_id` int(11) NOT NULL AUTO_INCREMENT,
`knowledge_id` int(11) DEFAULT NULL,
`dictionary_word_id` int(11) DEFAULT NULL,
`tag_text` varchar(100) DEFAULT NULL,
`tag_order` int(11) DEFAULT NULL,
`tag_level` int(11) DEFAULT NULL,
`knowledge_family_id` int(11) DEFAULT NULL,
`tag_weight` double DEFAULT '1',
`total_title_direct_words` int(11) DEFAULT NULL,
`total_title_parenthesis_words` int(11) DEFAULT NULL,
PRIMARY KEY (`tag_id`),
KEY `tags_by_knowledge_id` (`knowledge_id`),
KEY `tags_by_word_text` (`tag_text`,`tag_level`) USING BTREE,
KEY `tags_by_family_dictionary` (`knowledge_family_id`,`dictionary_word_id`,`tag_level`) USING BTREE,
KEY `tags_by_family_word_text` (`knowledge_family_id`,`tag_text`,`tag_level`) USING BTREE,
KEY `tags_by_dictionary` (`dictionary_word_id`,`tag_level`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=3316 DEFAULT CHARSET=utf8;
Best Answer
I suggest you add an index on all the columns used in the query, with this order (actually the order matters for the first 2 columns only, the rest 5 can be in any order within the index):
The query should then use this index for both the
WHERE
and theDISTINCT
(or group by) part. You probably need to change the query slightly, removing theDISTINCT
and usingGROUP BY
(this acts like a hint to the MySQL optimizer.)The
EXPLAIN
should then show:If the result are good enough for your expectations, you can then drop the
tags_by_word_text
index.