Mysql – Primary key index with a DATETIME as first part of the compound key is never used

indexMySQLmysql-5.5primary-key

I have a problem with INDEXING a DATETIME (or even a date) as first part of my PRIMARY KEY.

I use MySQL 5.5

Here are my two tables:

-- This is my standard table with dateDim as a dateTime

CREATE TABLE `stats` (
 `dateDim` datetime NOT NULL,
 `accountDim` mediumint(8) unsigned NOT NULL,
 `execCodeDim` smallint(5) unsigned NOT NULL,
 `operationTypeDim` tinyint(3) unsigned NOT NULL,
 `junkDim` tinyint(3) unsigned NOT NULL,
 `ipCountryDim` smallint(5) unsigned NOT NULL,
 `count` int(10) unsigned NOT NULL,
 `amount` bigint(20) NOT NULL,
 PRIMARY KEY (`dateDim`,`accountDim`,`execCodeDim`,`operationTypeDim`,`junkDim`,`ipCountryDim`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8


-- Here is a copy with datDim as an integer

CREATE TABLE `stats_todays` (
`dateDim` int(11) unsigned NOT NULL,
 `accountDim` mediumint(8) unsigned NOT NULL,
 `execCodeDim` smallint(5) unsigned NOT NULL,
 `operationTypeDim` tinyint(3) unsigned NOT NULL,
 `junkDim` tinyint(3) unsigned NOT NULL,
 `ipCountryDim` smallint(5) unsigned NOT NULL,
 `count` int(10) unsigned NOT NULL,
 `amount` bigint(20) NOT NULL,
 PRIMARY KEY (`dateDim`,`accountDim`,`execCodeDim`,`operationTypeDim`,`junkDim`,`ipCountryDim`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

I fill both table with exactly the same data (near 10 000 000)

But:

  • stats table use a DATETIME for dateDim
  • stats_todays use un INTEGER with TO_DAYS() for dateDim

My question is: why MySQL does'nt USE the PRIMARY KEY when the first part of the index is a datetime ???
It is very strange since With the same data but consolidated with an INTEGER and TO_DAYS(dateDim) the same request rocks….

Example with stats table (and datetime):

SELECT * 
FROM `stats`  
WHERE 
   dateDim = '2014-04-03 00:00:00' 
   AND accountDim = 4
   AND execCodeDim = 9
   AND operationTypeDim = 1
   AND junkDim = 5
   AND ipCountryDim = 3

=> 1 result (4.5sec)

Explain:

id  select_type     table   type    possible_keys   key     key_len     ref     rows           Extra
1   SIMPLE          stats   ALL           NULL     NULL       NULL      NULL    8832329     Using where

Same request on the other table stats_todays (With INTEGER and TO_DAYS() )

EXPLAIN SELECT * 
FROM `stats_todays`  
WHERE 
   dateDim = TO_DAYS('2014-04-03 00:00:00')
   AND accountDim = 4
   AND execCodeDim = 9
   AND operationTypeDim = 1
   AND junkDim = 5
   AND ipCountryDim = 3

=> Result 1 row (0.0003 sec) 

Explain:

id  select_type     table          type     possible_keys   key     key_len     ref                               rows  Extra
1   SIMPLE         stats_todays     const   PRIMARY     PRIMARY     13  const,const,const,const,const,const     1    

If you read the full post, you understand that is not a low cardinality problem since the request work with exactly the same cardinality with an INTEGER dateDim field….

Here is some advanced details:

SELECT COUNT( DISTINCT dateDim )
FROM stats_todays
UNION ALL
SELECT COUNT( DISTINCT dateDim )
FROM stats;

Result:


COUNT(DISTINCT dateDim)
2192
2192

Here is the INDEX description:

SHOW INDEXES FROM `stats` 

Table   Non_unique  Key_name    Seq_in_index    Column_name     Collation   Cardinality     Sub_part    Packed  Null    Index_type  Comment     Index_comment
stats   0            PRIMARY          1         dateDim           A     6921           NULL                 NULL        BTREE        
stats   0            PRIMARY          2         accountDim        A     883232         NULL                 NULL        BTREE        
stats   0            PRIMARY          3         execCodeDim       A     8832329     NULL                NULL        BTREE        
stats   0            PRIMARY          4         operationTypeDim  A     8832329     NULL                NULL        BTREE        
stats   0            PRIMARY          5         junkDim           A     8832329     NULL                NULL        BTREE        
stats   0            PRIMARY          6         ipCountryDim      A     8832329     NULL                NULL        BTREE       

SHOW INDEXES FROM `stats_todays` 

Table   Non_unique  Key_name    Seq_in_index    Column_name     Collation   Cardinality     Sub_part    Packed  Null    Index_type  Comment     Index_comment
stats_todays    0   PRIMARY     1              dateDim              A        7518   NULL                   NULL         BTREE        
stats_todays    0   PRIMARY     2              accountDim           A        4022582    NULL                   NULL         BTREE        
stats_todays    0   PRIMARY     3              execCodeDim          A        8045164    NULL                   NULL         BTREE        
stats_todays    0   PRIMARY     4              operationTypeDim     A        8045164    NULL                   NULL         BTREE        
stats_todays    0   PRIMARY     5              junkDim              A        8045164    NULL                   NULL         BTREE        
stats_todays    0   PRIMARY     6              ipCountryDim         A        8045164    NULL                   NULL         BTREE        

SELECT dateDim, COUNT(*) FROM stats GROUP BY dateDim WITH ROLLUP

  • tells there is 2192 different dates, and the repartition is smooth (approx 3000 – 4000 rows by date)
  • there is 8 831 990 rows in the table
  • The same for the other table
  • I tried with COVERING INDEX (replacing * by all PK columns) => nothing changed
  • I tried force|use index => nothing changed
  • The same with date field instead datetime
  • The same with INDEX or UNIQUE instead of primary key

Best Answer

This is a bug in 5.5.x. See here

That suggests that your query should be

SELECT * 
FROM `stats`  
WHERE 
   dateDim = CAST('2014-04-03 00:00:00' as datetime)
   AND accountDim = 4
   AND execCodeDim = 9
   AND operationTypeDim = 1
   AND junkDim = 5
   AND ipCountryDim = 3