MySQL InnoDB – Optimize Simple SELECT Query for 2 Million Rows

Tags: innodb, myisam, mysql, optimization, select

I'd like to make this query as fast as possible, since it's called really often.

SELECT login_events.id 
FROM login_events 
WHERE (
  DATE(created_at) >= DATE(CURRENT_DATE) 
  AND person_id = 1
) 
LIMIT 1

It's running on a 157.7 MB InnoDB table (according to Navicat) with ~2 million rows, indexed on [created_at, person_id].

Using EXPLAIN I see it's using the index, but it says "Using where; Using index;". What can I do to make this the fastest possible? Would switching to MyISAM gain me anything?

Best Answer

First, focus on the three fields in the query:

  • id
  • created_at
  • person_id

The index you have, (created_at, person_id), leads with created_at, so the query has to scan the index by date (at best, every entry from CURRENT_DATE onward) and check each entry for the person_id.
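
To see this for yourself, you could list the existing indexes and ask the optimizer for its plan. This is only a sketch: it assumes the table and column names from the question, and the exact EXPLAIN output will vary by MySQL version.

```sql
-- List the indexes currently defined on the table.
-- Seq_in_index shows the column order within each index.
SHOW INDEX FROM login_events;

-- Ask the optimizer how it would execute the original query.
-- The question reports "Using where; Using index" in the Extra column.
EXPLAIN
SELECT login_events.id
FROM login_events
WHERE DATE(created_at) >= DATE(CURRENT_DATE)
  AND person_id = 1
LIMIT 1;
```

The "rows" column of the EXPLAIN output is the optimizer's estimate of how many index entries it expects to examine, which is the number you want to shrink.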

SUGGESTION #1 : You will definitely need a different index

MyISAM

If login_events is MyISAM, this is the index you need

ALTER TABLE login_events ADD INDEX person_date_ndx (person_id,created_at,id);

This changes how the query runs: it looks up the specific person_id first and scans only the days for person_id 1. Why include id in the index? So that the query can retrieve id from the index file alone rather than from the table. That way, all 3 fields are retrieved from the index file instead of 2 fields from the index and 1 from the table.

InnoDB

If login_events is InnoDB, this is the index you need

ALTER TABLE login_events ADD INDEX person_date_ndx (person_id,created_at);

The reasoning is the same, but you do not need to include id. Why? Every entry in an InnoDB secondary index also stores the primary key value as its pointer back to the clustered index, so (assuming id is the PRIMARY KEY) id is already available from the index itself. Adding id to the index would simply be redundant.
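
Once the new index exists, a quick check along these lines may help confirm it is being picked up. It assumes id is the PRIMARY KEY, which the question does not actually show, and it uses the index name person_date_ndx from the ALTER TABLE statement.

```sql
-- Confirm the table definition: engine, primary key, and secondary indexes.
-- (That id is the PRIMARY KEY is an assumption; verify it here.)
SHOW CREATE TABLE login_events;

-- Re-check the plan after adding person_date_ndx.
-- If the optimizer chooses the new index, the "key" column should name it.
EXPLAIN
SELECT login_events.id
FROM login_events
WHERE DATE(created_at) >= DATE(CURRENT_DATE)
  AND person_id = 1
LIMIT 1;
```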

SUGGESTION #2 : Change the Date Comparison

From the expression

DATE(created_at) >= DATE(CURRENT_DATE) 

I can tell that created_at is either DATETIME or TIMESTAMP.

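The expression forces the query to convert every row's created_at value into a DATE before comparing it. Wrapping the column in a function also keeps MySQL from using an index range on created_at.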

Therefore, instead of

SELECT login_events.id 
FROM login_events 
WHERE (
  DATE(created_at) >= DATE(CURRENT_DATE) 
  AND person_id = 1
) 
LIMIT 1

express the date comparison as a time comparison starting from midnight today:

SELECT login_events.id 
FROM login_events 
WHERE (
  created_at >= (DATE(NOW()) + INTERVAL 0 SECOND)
  AND person_id = 1
) 
LIMIT 1
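
If the midnight expression looks unfamiliar, a sketch like the following can be used to compare it with two other spellings that should yield the same DATETIME value. The column aliases are made up for illustration; which spelling you use is a matter of taste.

```sql
-- Three ways to express "midnight today" as a DATETIME.
-- The aliases are hypothetical names chosen for this example.
SELECT
  DATE(NOW()) + INTERVAL 0 SECOND AS midnight_from_answer,
  TIMESTAMP(CURRENT_DATE)         AS midnight_via_timestamp,
  CAST(CURRENT_DATE AS DATETIME)  AS midnight_via_cast;

-- Check the plan for the rewritten query. With an index on
-- (person_id, created_at), the bare created_at comparison lets the
-- optimizer consider a range within person_id = 1.
EXPLAIN
SELECT login_events.id
FROM login_events
WHERE person_id = 1
  AND created_at >= (DATE(NOW()) + INTERVAL 0 SECOND)
LIMIT 1;
```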

CAVEAT

Since the table is so small, either storage engine would be fine. I would give the edge to MyISAM.
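
Before deciding, you may want to confirm which engine the table uses and how large it really is. The query below is a sketch against information_schema. It assumes login_events lives in the currently selected database. Note that TABLE_ROWS is only an estimate for InnoDB.

```sql
-- Report the storage engine and approximate size of the table.
SELECT ENGINE, TABLE_ROWS, DATA_LENGTH, INDEX_LENGTH
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'login_events';

-- Switching engines rebuilds the whole table, so try it on a copy first.
-- MyISAM has no transactions or foreign keys, so check that nothing
-- relies on them before running this.
-- ALTER TABLE login_events ENGINE = MyISAM;
```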

Give it a try!