Mysql – Query optimization with extreme statistic

indexMySQLoptimization

I have a table:

mysql> describe table_1; 
+---------------+------------------+------+-----+---------------------+----------------+
| Field         | Type             | Null | Key | Default             | Extra          |
+---------------+------------------+------+-----+---------------------+----------------+
| id            | int(12)          | NO   | PRI | NULL                | auto_increment |
| date          | datetime         | YES  |     | 0000-00-00 00:00:00 |                |
| col1          | tinyint(1)       | YES  |     | 1                   |                |
| col2          | longtext         | YES  |     | NULL                |                |
| col3          | tinyint(1)       | YES  |     | 1                   |                |
+---------------+------------------+------+-----+---------------------+----------------+

with the following characteristics:

mysql> select distinct(col1), count(col1) from table_1 group by col1;
+--------+---------------+
| col1   | count(col1)   |
+--------+---------------+
|      0 |           200 |
|      1 |        689747 |
+--------+---------------+

How can I optimize queries like these, and with what kind of indexes?

1. SELECT * FROM table_1 WHERE col1 = 1
AND id NOT IN (79869)  ORDER BY  date  DESC LIMIT 0, 10;
2. SELECT * FROM table_1 WHERE col1 = 1 AND id <> 18983
AND date > '2012-10-26'  AND ( col2 LIKE '%word1%' OR  col2 LIKE '%word2%')
ORDER BY date DESC LIMIT 5;

Best Answer

At the risk of stating the obvious, you have a cardinality problem with col1.

Looking at the two queries you posted in the question:

SELECT * FROM table_1 WHERE col1 = 1
AND id NOT IN (79869) ORDER BY date DESC LIMIT 0, 10;

SELECT * FROM table_1 WHERE col1 = 1 AND id <> 18983
AND date > '2012-10-26' AND (col2 LIKE '%word1%' OR col2 LIKE '%word2%')
ORDER BY date DESC LIMIT 5;

you need good indexes that will accommodate these two queries. In your particular case, you will need just one index.

If the table is MyISAM, here is that index

ALTER TABLE table_1 ADD INDEX col1_date_ndx (col1,date,col2,id);

If the table is InnoDB, here is that index

ALTER TABLE table_1 ADD INDEX col1_date_ndx (col1,date,col2);

How will this index help? Although you have to live with the lopsided cardinality of col1, you improve each query's chances of being effectively searched.

You could also take a chance on refactoring the queries. How ???

Perhaps you could collect the ids only then join the ids back to the original table.

QUERY #1

Here is the first query

SELECT * FROM table_1 WHERE col1 = 1
AND id NOT IN (79869) ORDER BY date DESC LIMIT 0, 10;

Collect the ids and join back to table_1

SELECT B.* FROM
(SELECT id FROM table_1 WHERE col1 = 1
AND id NOT IN (79869) ORDER BY date DESC LIMIT 0, 10) A
LEFT JOIN table_1 B USING (id);

QUERY #2

Here is the second query

SELECT * FROM table_1 WHERE col1 = 1 AND id <> 18983
AND date > '2012-10-26' AND (col2 LIKE '%word1%' OR col2 LIKE '%word2%')
ORDER BY date DESC LIMIT 5;

Collect the ids and join back to table_1

SELECT B* FROM
(SELECT * FROM table_1 WHERE col1 = 1 AND id <> 18983
AND date > '2012-10-26' AND (col2 LIKE '%word1%' OR col2 LIKE '%word2%')
ORDER BY date DESC LIMIT 5) A
LEFT JOIN table_1 B USING (id);

I cannot make promises on the queries, but, at the very least, the index suggestion should help.

Give it a Try !!!