MySQL Performance Issue with Indexed Datetime Column

datetimemyisamMySQLoptimizationperformance

I tried to solve the following problem for about one hour now and still didn't get any further with it.

Okay, I have a table (MyISAM):

+---------+-------------+------+-----+-------------------+----------------+
| Field   | Type        | Null | Key | Default           | Extra          |
+---------+-------------+------+-----+-------------------+----------------+
| id      | int(11)     | NO   | PRI | NULL              | auto_increment |
| http    | smallint(3) | YES  | MUL | 200               |                |
| elapsed | float(6,3)  | NO   |     | NULL              |                |
| cached  | tinyint(1)  | YES  |     | NULL              |                |
| ip      | int(11)     | NO   |     | NULL              |                |
| date    | timestamp   | NO   | MUL | CURRENT_TIMESTAMP |                |
+---------+-------------+------+-----+-------------------+----------------+

Please don't mind the indexes, I've been playing around trying to find a solution. Now, here's my query.

SELECT http,
COUNT( http )  AS count 
FROM reqs
WHERE DATE(date) >= cast(date_sub(date(NOW()),interval 24 hour) as datetime)
GROUP BY http
ORDER BY count;

the table is storing information about incoming web requests so its a rather big database.

+-----------+
| count(id) |
+-----------+
|    782412 |
+-----------+

note that there's no better way of setting a primary key as the id column will be the only unique identifier I have. The above mentioned query takes about 0.6-1.6 seconds to run.

Which index would be clever? I figured that indexing date will give me "bad" cardinality and thus MySQL won't use it. http is also a bad choice as there are only about 20 different possible values.

Thanks for you help!

Update 1 I've added an index on (http, date) as ypercube suggested:

mysql> CREATE INDEX httpDate ON reqs (http, date);

and used his query, but it performed equally bad. The added index:

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| reqs  |          0 | PRIMARY  |            1 | id          | A         |      798869 |     NULL | NULL   |      | BTREE      |         |
| reqs  |          1 | httpDate |            1 | http        | A         |          19 |     NULL | NULL   | YES  | BTREE      |         |
| reqs  |          1 | httpDate |            2 | date        | A         |       99858 |     NULL | NULL   |      | BTREE      |         |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

and the EXPLAIN

+----+--------------------+-------+-------+---------------+----------+---------+------+-------+-----------------------------------------------------------+
| id | select_type        | table | type  | possible_keys | key      | key_len | ref  | rows  | Extra                                                     |
+----+--------------------+-------+-------+---------------+----------+---------+------+-------+-----------------------------------------------------------+
|  1 | PRIMARY            | r     | range | NULL          | httpDate | 3       | NULL |    20 | Using index for group-by; Using temporary; Using filesort |
|  2 | DEPENDENT SUBQUERY | ri    | ref   | httpDate      | httpDate | 3       | func | 41768 | Using where; Using index                                  |
+----+--------------------+-------+-------+---------------+----------+---------+------+-------+-----------------------------------------------------------+

MySQL server version:

mysql> SHOW VARIABLES LIKE "%version%";
+-------------------------+---------------------+
| Variable_name           | Value               |
+-------------------------+---------------------+
| protocol_version        | 10                  |
| version                 | 5.1.73              |
| version_comment         | Source distribution |
| version_compile_machine | x86_64              |
| version_compile_os      | redhat-linux-gnu    |
+-------------------------+---------------------+
5 rows in set (0.00 sec)

Best Answer

I have three suggestions

SUGGESTION #1 : Rewrite the query

You should rewrite the query as follows

SELECT http,
COUNT( http )  AS count 
FROM reqs
WHERE date >= ( DATE(NOW() - INTERVAL 1 DAY) + INTERVAL 0 SECOND )
GROUP BY http
ORDER BY count;

or

SELECT * FROM
(
    SELECT http,
    COUNT( http )  AS count 
    FROM reqs
    WHERE date >= ( DATE(NOW() - INTERVAL 1 DAY) + INTERVAL 0 SECOND )
    GROUP BY http
) A ORDER BY count;

The WHERE should not have a function on both sides of the equal sign. Having date on the left side of the equals sign makes it easier for the Query Optimizer to use an index against it.

SUGGESTION #2 : Supporting Index

I would also suggest a different index

ALTER TABLE reqs ADD INDEX date_http_ndx (date,http); -- not (http,date) 

I suggest this order of columns because the date entries would all be contiguous in the index. Then, the query simply collects http values without skipping gaps in http.

SUGGESTION #3 : Bigger Key Buffer (Optional)

MyISAM only uses index caching. Since the query should not touch the .MYD file, you should use a slightly bigger MyISAM Key Buffer.

To set it to 256M

SET @newsize = 1024 * 1024 * 256;
SET GLOBAL key_buffer_size = @newsize;

Then, set it in my.cnf

[mysqld]
key_buffer_size = 256M

Restart of MySQL not required

Give it a Try !!!