Mysql – Why search by date field requires engine to touch all records

explainindexMySQL

Why search by date field requires engine to touch all records? Is there any way to avoid it?

Following query

EXPLAIN select * FROM playdays where realdate>=Date('2010,01,01') ;

shows that amount of rows required to generate the result is 2725 which is higher than total amount of records in the table (2349).

Question #1: Why amount of rows to touch is higher than amount of records?

Question #2: Why amount of rows to touch is so high even though there is an index on the 'realdate' field? Is that really means query will touch all records? Should I optimize it? If yes, how?

Please advise! Any feedback is welcome! Thank you!

P.S. I use MYSQL 5.1.43 version, my table has realdate field, its type is 'DATE', table has totally 2349 records.

P.P.S. Answering to one of comments: selecting only realdate field reduces amount of touched records to 1362… if I vary date value used in search, amount of touched records changes depends of amount records left in the selections…

Hope, that additional information will help.

Best Answer

As @Razor commented, the number reported by EXPLAIN is only an estimate, and it can be off from your true row count by +/- 10% or even more.

MySQL prefers not to use an index at all if the portion of the table selected by the query conditions is too large. Think of the index at the back of a book. Why don't they index the word "the"? Because it would just list every page in the book. If the word occurs on too many pages in the book, it's actually easier to just read the book cover-to-cover instead of using an index.

Similarly, in MySQL if the optimizer estimates that the values you are searching for are found on a significantly high number of rows (in my experience "significant" is at least 16-18%) then MySQL skips the use of the index and just does a table-scan instead.

Sometimes this is the wrong choice, and you can give MySQL a hint that a table-scan is actually much more expensive than the preferred index choice:

EXPLAIN select * FROM playdays FORCE INDEX(realdate) where realdate>=Date('2010,01,01') ;

Be careful about using index hints too liberally, because you effectively hard-code your index choices into your queries by using them, and that may become the wrong strategy as your data changes over time. They should be used in exception cases, when you can demonstrate that the optimizer make the wrong choice.