I am looking to partition a table (nearly 30M rows) by date ranges
(mysql version : 5.1.73)
CREATE TABLE `TABLE_X` (
`name_x` varchar(50) NOT NULL,
`type_x` varchar(5) NOT NULL,
`status_x` tinyint(1) NOT NULL,
`date_x` date NOT NULL,
KEY `xdate` (`date_x`)
)ENGINE=MyISAM DEFAULT CHARSET=latin1
PARTITION BY RANGE (TO_DAYS(`date_x`))
(
PARTITION p13q1 VALUES LESS THAN (TO_DAYS('2013-04-01')),
PARTITION p13q2 VALUES LESS THAN (TO_DAYS('2013-07-01')),
PARTITION p13q3 VALUES LESS THAN (TO_DAYS('2013-10-01')),
PARTITION p13q4 VALUES LESS THAN (TO_DAYS('2014-01-01')),
PARTITION p14q1 VALUES LESS THAN (TO_DAYS('2014-04-01')),
PARTITION p14q2 VALUES LESS THAN (TO_DAYS('2014-07-01')),
PARTITION p14q3 VALUES LESS THAN (TO_DAYS('2014-10-01')),
PARTITION p14q4 VALUES LESS THAN (TO_DAYS('2015-01-01')),
PARTITION p15q1 VALUES LESS THAN (TO_DAYS('2015-04-01')),
PARTITION p15q2 VALUES LESS THAN (TO_DAYS('2015-07-01')),
PARTITION p15q3 VALUES LESS THAN (TO_DAYS('2015-10-01')),
PARTITION p15q4 VALUES LESS THAN (TO_DAYS('2016-01-01')),
PARTITION p16q1 VALUES LESS THAN (TO_DAYS('2016-04-01'))
);
when i do a 'select' query to retrieve data between two dates from the same month, the result is correct
explain partitions select count(*) from TABLE_X where date_x between '2013-10-01' and '2013-10-27' \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: TABLE_X
partitions: p13q4
type: index
possible_keys: xdate
key: xdate
key_len: 4
ref: NULL
rows: 984405
Extra: Using where; Using index
but when i try to retrieve rows between different months, i get this result
Eg 1:
explain partitions select count(*) from TABLE_X where date_x between '2013-10-01' and '2013-11-27' \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: TABLE_X
partitions: p13q1,p13q4
type: index
possible_keys: xdate
key: xdate
key_len: 4
ref: NULL
rows: 1562732
Extra: Using where; Using index
this query should use only p13q4 partition
Eg 2:
explain partitions select count(*) from TABLE_X where date_x between '2013-10-01' and '2014-01-27' \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: TABLE_X
partitions: p13q1,p13q4,p14q1
type: index
possible_keys: xdate
key: xdate
key_len: 4
ref: NULL
rows: 2085003
Extra: Using where; Using index
this query should use p13q4 & p14q1 partitions.
I don't understand why it's using the p13q1 partition,
What am I missing?
Best Answer
This seems to be known problem with to_days and between - if there were some invalid dates in the table, to_days return NULL on them and stores them in the first partition. And when evaluating BETWEEN, mysql can sometimes decide that some dates in that range are invalid and so it thinks it has to check first partition to find them. Such dates giving NULL may be for instance '2013-11-00' which might theoreticaly fall to that between range so optimizer better checks for them - and there is probably no consensus if it is right or wrong..
Possible workaround would be to define special partition for null values. That would be empty most of the time, so performace should be better in some cases.