Mysql – Table partition should not be selected

MySQLmysql-5.1partitioning

I am looking to partition a table (nearly 30M rows) by date ranges
(mysql version : 5.1.73)

CREATE TABLE `TABLE_X` (
  `name_x` varchar(50) NOT NULL,
  `type_x` varchar(5) NOT NULL,
  `status_x` tinyint(1) NOT NULL,
  `date_x` date NOT NULL,
  KEY `xdate` (`date_x`)
)ENGINE=MyISAM DEFAULT CHARSET=latin1
PARTITION BY RANGE (TO_DAYS(`date_x`))
 (
    PARTITION p13q1 VALUES LESS THAN (TO_DAYS('2013-04-01')),
    PARTITION p13q2 VALUES LESS THAN (TO_DAYS('2013-07-01')),
    PARTITION p13q3 VALUES LESS THAN (TO_DAYS('2013-10-01')),
    PARTITION p13q4 VALUES LESS THAN (TO_DAYS('2014-01-01')),
    PARTITION p14q1 VALUES LESS THAN (TO_DAYS('2014-04-01')),
    PARTITION p14q2 VALUES LESS THAN (TO_DAYS('2014-07-01')),
    PARTITION p14q3 VALUES LESS THAN (TO_DAYS('2014-10-01')),
    PARTITION p14q4 VALUES LESS THAN (TO_DAYS('2015-01-01')),
    PARTITION p15q1 VALUES LESS THAN (TO_DAYS('2015-04-01')),
    PARTITION p15q2 VALUES LESS THAN (TO_DAYS('2015-07-01')),
    PARTITION p15q3 VALUES LESS THAN (TO_DAYS('2015-10-01')),
    PARTITION p15q4 VALUES LESS THAN (TO_DAYS('2016-01-01')),
    PARTITION p16q1 VALUES LESS THAN (TO_DAYS('2016-04-01'))
);

when i do a 'select' query to retrieve data between two dates from the same month, the result is correct

explain partitions select count(*) from TABLE_X where date_x between '2013-10-01' and '2013-10-27' \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: TABLE_X
   partitions: p13q4
         type: index
possible_keys: xdate
          key: xdate
      key_len: 4
          ref: NULL
         rows: 984405
        Extra: Using where; Using index

but when i try to retrieve rows between different months, i get this result

Eg 1:

explain partitions select count(*) from TABLE_X where date_x between '2013-10-01' and '2013-11-27' \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: TABLE_X
   partitions: p13q1,p13q4
         type: index
possible_keys: xdate
          key: xdate
      key_len: 4
          ref: NULL
         rows: 1562732
        Extra: Using where; Using index

this query should use only p13q4 partition

Eg 2:

    explain partitions select count(*) from TABLE_X where date_x between '2013-10-01' and '2014-01-27' \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: TABLE_X
   partitions: p13q1,p13q4,p14q1
         type: index
possible_keys: xdate
          key: xdate
      key_len: 4
          ref: NULL
         rows: 2085003
        Extra: Using where; Using index

this query should use p13q4 & p14q1 partitions.

I don't understand why it's using the p13q1 partition,

What am I missing?

Best Answer

This seems to be known problem with to_days and between - if there were some invalid dates in the table, to_days return NULL on them and stores them in the first partition. And when evaluating BETWEEN, mysql can sometimes decide that some dates in that range are invalid and so it thinks it has to check first partition to find them. Such dates giving NULL may be for instance '2013-11-00' which might theoreticaly fall to that between range so optimizer better checks for them - and there is probably no consensus if it is right or wrong..

Possible workaround would be to define special partition for null values. That would be empty most of the time, so performace should be better in some cases.