Mysql – Slow MySQL query when the IN list is long

MySQLmysql-5.6performancequery-performance

I have 3 tables, namely: product, product_has_category, category:

CREATE TABLE IF NOT EXISTS `product` (
`id` int(11) NOT NULL,
  `name` varchar(255) DEFAULT NULL,
  `des` text COMMENT 'description',
  `price` decimal(10,2) DEFAULT NULL COMMENT 'Price of product',
  `vendor` varchar(255) DEFAULT NULL COMMENT 'Where this product is from , can be derived from p_url field',
  `p_url` text,
  `hide_flag` int(11) NOT NULL COMMENT 'When you dont want to show a product but dont want to delete it either. 0 = not hidden, 1 = hidden',
  `del_flag` int(11) NOT NULL COMMENT 'Delete Flag 1 = deleted , 0 = not deleted'
) ENGINE=InnoDB AUTO_INCREMENT=58897 DEFAULT CHARSET=latin1;
ALTER TABLE `product`
 ADD PRIMARY KEY (`id`), ADD KEY `del_flag` (`del_flag`,`hide_flag`), ADD KEY `name` (`del_flag`,`name`);

CREATE TABLE IF NOT EXISTS `product_has_category` (
`id` int(11) NOT NULL,
  `product_id` int(11) NOT NULL,
  `category_id` int(11) NOT NULL
) ENGINE=InnoDB AUTO_INCREMENT=78705 DEFAULT CHARSET=latin1;
ALTER TABLE `product_has_category`
 ADD PRIMARY KEY (`id`), ADD KEY `fk_product_has_category_category1_idx` (`category_id`), ADD KEY `fk_product_has_category_product1_idx` (`product_id`);

CREATE TABLE IF NOT EXISTS `category` (
`id` int(11) NOT NULL,
  `name` varchar(255) NOT NULL,
  `des` text COMMENT 'Description field',
  `lft` int(11) NOT NULL,
  `rgt` int(11) NOT NULL,
  `parent_id` int(11) NOT NULL COMMENT 'parent_id , allows for hybrid nested set model to speed up search where depth = 1 only'
) ENGINE=InnoDB AUTO_INCREMENT=1383 DEFAULT CHARSET=latin1;
 ADD PRIMARY KEY (`id`), ADD KEY `category_lft_idx` (`lft`);

I have a query to look up products when they match the category that the user is searching for.

Each product can be linked to multiple categories, also the number of categories that will be defined in the query is based on what the user chooses.

As such my query has become slow due to number of category ids that are being checked.

My query is as follows:

SELECT DISTINCT
    `product`.`id`, `product`.`name`, `product`.`des`, `product`.`del_flag`,
    `product`.`vendor`, `product`.`price`, `product`.`p_url`
FROM `product`
INNER JOIN `product_has_category` as `product_category`
    ON `product_category`.`product_id` = `product`.`id`
WHERE `product`.`del_flag` = '0'
AND `product_category`.`category_id` IN
(
    '189', '190', '191', '192', '193', '194', '195', '196', '197', '198',
    '199', '200', '331', '332', '333', '334', '335', '336', '337', '338',
    '339', '340', '207', '208', '209', '210', '211', '212', '213', '214',
    '215', '216', '217', '218', '219', '220', '221', '222', '223', '224',
    '225', '226', '227', '228', '229', '230', '231', '232', '233', '234',
    '235', '236', '237', '238', '239', '240', '241', '242', '243', '244',
    '245', '246', '247', '248', '249', '250', '251', '252', '253', '254',
    '255', '256', '257', '258', '259', '260', '261', '262', '263', '264',
    '265', '282', '283', '284', '285', '286', '287', '288', '267', '268',
    '269', '270', '271', '272', '273', '274', '275', '276', '277', '278',
    '279', '280', '281', '289', '290', '291', '292', '293', '294', '295',
    '308', '309', '310', '311', '312', '313', '314', '315', '316', '298',
    '299', '300', '301', '302', '303', '304', '305', '306', '307', '323',
    '324', '325', '326', '327', '328', '329', '318', '319', '320', '321',
    '322', '341', '342', '343', '344', '345', '346', '347', '348', '349',
    '350', '351', '352', '353', '354', '355', '356', '357', '358', '374',
    '375', '376', '377', '378', '379', '380', '381', '382', '360', '361',
    '362', '383', '384', '385', '386', '387', '388', '364', '389', '390',
    '391', '392', '366', '393', '394', '395', '368', '369', '370', '396',
    '397', '398', '399', '400', '401', '402', '403', '404', '372', '405',
    '406', '409', '410', '411', '412', '413', '414', '415', '416', '417',
    '418', '419', '420', '421', '422', '423', '424', '425', '426', '427',
    '428', '429', '430', '431', '432', '433', '434', '435', '436', '437',
    '438', '439', '440', '441', '442', '443', '452', '453', '454', '455',
    '456', '457', '458', '459', '460', '461', '462', '463', '464', '465',
    '466', '447', '448', '449', '450', '451', '467', '468', '469', '470',
    '471', '472', '475', '476', '477', '478', '479', '480', '481', '482',
    '483', '484', '485', '486', '487'
)
ORDER BY `name` asc

MySQL explain shows me the following:

mysqlexplain

This problem did not exists when the query was small.

EXPLAIN SELECT DISTINCT `product`.`id`, `product`.`name`, `product`.`des`, `product`.`del_flag`, `product`.`vendor`, `product`.`price`, `product`.`p_url`
FROM `product`
INNER JOIN `product_has_category` as `product_category` ON `product_category`.`product_id` = `product`.`id`
WHERE `product`.`del_flag` = '0'
AND `product_category`.`category_id` IN('189',  '487')
ORDER BY `name` asc

When the number of categories to search was small mysql explain showed the following:

mysqlexplain2

I'm not a DBA so i would appreciate any help to help me optimize this code to better function.

Best Answer

It is more efficient to do a table scan rather than use an index in the case where you need to touch more than 20% of the rows. The optimizer knows this and uses it to good advantage. (The "20%" varies with the phase of the moon.)

Don't use a 4-byte INT for a flag; use a TINYINT UNSIGNED.

In product_has_category, don't have an id, instead use

PRIMARY KEY(product_id, category_id),
INDEX(      category_id, product_id)

That will make using the table more efficient.

(When showing CREATE TABLE, include the indexes inside, not as an afterthought. Or simply do SHOW CREATE TABLE for displaying here.)