I have a department table with the following structure
Create Table
CREATE TABLE `department` (
`id` binary(16) NOT NULL,
`name` varchar(255) NOT NULL,
`type` int(11) NOT NULL,
`status` tinyint(4) NOT NULL, -- Possible values are 0,1,2 only
PRIMARY KEY (`id`),
UNIQUE KEY `UK_2xsp2nild3xbgkg4pln7cviib` (`status`,`type`,`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Following query doesn't utilise the status
index as it is the left most column in unique constraint.
EXPLAIN SELECT * FROM department d WHERE d.status = 1;
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
------ ----------- ------ ---------- ------ ---------------------------- ------ ------- ------ ------ -------- -------------
1 SIMPLE d (NULL) ALL UK_2xsp2nild3xbgkg4pln7cviib (NULL) (NULL) (NULL) 4 100.00 Using where
But when querying on another table which has unique constraint on a single column, it uses the index.
CREATE TABLE `account_unit` (
`id` binary(16) NOT NULL,
`status` tinyint(4) NOT NULL,
`unit_name` varchar(255) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `UK_2x8b55sxftahu8cjbqld9sw9b` (`unit_name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
EXPLAIN SELECT * FROM account_unit au WHERE au.`unit_name` = 'AU1'
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
------ ----------- ------ ---------- ------ ---------------------------- ---------------------------- ------- ------ ------ -------- --------
1 SIMPLE au (NULL) const UK_2x8b55sxftahu8cjbqld9sw9b UK_2x8b55sxftahu8cjbqld9sw9b 767 const 1 100.00 (NULL)
As per my understanding, left most column on a multi column constraint will be utilized as index on select queries. Enlighten me if I'm wrong.
Best Answer
It's almost certainly due to the low cardinality of the index on the first column. Since there are only 3 values, the optimizer decides that it is better do do a full scan on the table.
Using the index on the first column will point to about one third of the rows. Looking up that many rows based on the primary key will likely lead to practically all of the primary key pages being pulled into the buffer along with a large portion of the secondary index pages, leading to more memory usage and potentially disk traffic. Overall, it is more efficient and usually about the same speed to just do a full scan on the data if a large portion of the data is likely to be selected based on a secondary index lookup.