Mysql – Optimize query or model, to get an population overview in time

group byindexindex-tuningMySQLoptimization

I think I've got a very simple model and query, but I cannot find a way to make use of indexes to boost performance.

I wonder if there is a trick in my model, or MySQL query, to speed up the execution time to get an overview of my population at a specific time.

I have the following table which holds the status (history) of my users:

CREATE TABLE `status` (
 `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
 `user_id` varchar(9) NOT NULL,
 `status` enum('NEW','REGISTERED','CANCELED') NOT NULL,
 `timestamp` datetime NOT NULL,
 `explanation` varchar(25) NOT NULL,

 PRIMARY KEY (`id`),
 KEY `index1` (`user_id`,`timestamp`,`id`),
 KEY `index2` (`timestamp`,`user_id`,`id`),

) ENGINE=InnoDB DEFAULT CHARSET=utf8

My table will hold a lot of users (5 million), which can have one, or more statuses (so the table will probably hold 10+ million records).
Records are inserted in chronological order, so ID and timestamp are "in sync" (timestamps are not required to be unique, but the autoincrement ID's are)

What I do to get an overview of my population at a specific time is:

SELECT MAX(id) FROM `status` WHERE timestamp <= NOW() GROUP BY user_id;

Which gives me the correct ID's, but since MySQL cannot use indexes when sorting on a column (for the where clause), which is not in the GROUP BY, there is no performance.

The explain tells me:

explain SELECT MAX(id) FROM `status` WHERE timestamp <= NOW() GROUP BY user_id;
+----+-------------+--------+-------+---------------------+---------------------+---------+------+----------+-----------------------------------------------------------+
| id | select_type | table  | type  | possible_keys       | key                 | key_len | ref  | rows     | Extra                                                     |
+----+-------------+--------+-------+---------------------+---------------------+---------+------+----------+-----------------------------------------------------------+
|  1 | SIMPLE      | status | range | index2              | index2              | 8       | NULL | 13897517 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+--------+-------+---------------------+---------------------+---------+------+----------+-----------------------------------------------------------+

Does anyone knows how I can optimize my model, or my query (for instance with subqueries) (, to allow MySQL to make use of indexes), and so boost performance?

What I eventually want is:

SELECT user_id, status, timestamp, explanation FROM `status` WHERE id IN (
  SELECT MAX(id) FROM `status` WHERE timestamp <= NOW() GROUP BY user_id
);

Best Answer

Ok, i tried to reproduse your case:

mysql> insert into status values (1,'a','NEW',NOW(),'test'); Query OK, 1 row affected (0.00 sec)

mysql> insert into status values (2,'b','NEW',NOW(),'test2'); Query OK, 1 row affected (0.00 sec)

mysql> insert into status values (3,'c','NEW',NOW(),'test3'); Query OK, 1 row affected (0.00 sec)

mysql> explain  SELECT MAX(id) FROM `status` WHERE timestamp <= NOW() GROUP BY user_id;
+----+-------------+--------+-------+---------------+--------+---------+------+------+--------------------------+
| id | select_type | table  | type  | possible_keys | key    | key_len | ref  | rows | Extra                    |
+----+-------------+--------+-------+---------------+--------+---------+------+------+--------------------------+
|  1 | SIMPLE      | status | index | index2        | index1 | 41      | NULL |    3 | Using where; Using index |
+----+-------------+--------+-------+---------------+--------+---------+------+------+--------------------------+

Mysql uses index like you wanted.

And then:

mysql> alter table status drop index index1;
Query OK, 0 rows affected (0.00 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> explain  SELECT MAX(id) FROM `status` WHERE timestamp <= NOW() GROUP BY user_id;
+----+-------------+--------+-------+---------------+--------+---------+------+------+-----------------------------------------------------------+
| id | select_type | table  | type  | possible_keys | key    | key_len | ref  | rows | Extra                                                     |
+----+-------------+--------+-------+---------------+--------+---------+------+------+-----------------------------------------------------------+
|  1 | SIMPLE      | status | index | index2        | index2 | 41      | NULL |    3 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+--------+-------+---------------+--------+---------+------+------+-----------------------------------------------------------+
1 row in set (0.00 sec)

The plan is like yours. Please ensure that index1 is really present.

I am runnig mysql 5.5.33-31.1-log You can even try 5.6.

Also please try to force index1 and post the results.