MySQL – Troubleshooting DELETE Statement Hanging Issues

deleteinnodbMySQL

The following command:

DELETE FROM hours WHERE id IN (SELECT id FROM holdkey WHERE persons_id=60)

seems to hang indefinitely. However, the corresponding SELECT * command completes in a fraction of a second, see 'explain' command below.

Deleting rows individually is fast too.

Since the field 'id' is the primary key, I don't see any reason for the DELETE command to take any significant amount of time.

Is there an explanation and/or a work-around?

The table 'hours' contains approx 1 million rows, and 161 rows are to be deleted using the above command.

I'm using MySQL version 5.5 with the InnoDB engine. The server is completely idle.

Removing the foreign key from the 'hours' table and then changing the table engine to MyISAM makes no difference.

The table hours looks like this:

MariaDB [MSG_PLAN]> show create table hours \G;
*************************** 1. row ***************************
       Table: hours
Create Table: CREATE TABLE `hours` (
  `id` int(6) NOT NULL AUTO_INCREMENT,
  `persons_id` int(6) NOT NULL,
  `year_nr` int(6) NOT NULL,
  `week_nr` int(6) NOT NULL,
  `week_day` int(6) NOT NULL,
  `activity_id` int(6) NOT NULL,
  `hours` decimal(3,1) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `id` (`id`),
  KEY `persons_id` (`persons_id`),
  CONSTRAINT `hours_ibfk_1` FOREIGN KEY (`persons_id`) REFERENCES `persons` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1057141 DEFAULT CHARSET=latin1

MariaDB [MSG_PLAN]> EXPLAIN SELECT *  FROM hours WHERE id IN (SELECT id FROM holdkey WHERE persons_id=60);
+------+--------------+-------------+--------+---------------+---------+---------+---------------------+------+-------------+
| id   | select_type  | table       | type   | possible_keys | key     | key_len | ref                 | rows | Extra       |
+------+--------------+-------------+--------+---------------+---------+---------+---------------------+------+-------------+
|    1 | PRIMARY      | <subquery2> | ALL    | distinct_key  | NULL    | NULL    | NULL                |  851 |             |
|    1 | PRIMARY      | hours       | eq_ref | PRIMARY,id    | PRIMARY | 4       | MSG_PLAN.holdkey.id |    1 |             |
|    2 | MATERIALIZED | holdkey     | ALL    | NULL          | NULL    | NULL    | NULL                |  851 | Using where |
+------+--------------+-------------+--------+---------------+---------+---------+---------------------+------+-------------+

MariaDB [MSG_PLAN]> show create table holdkey \G;
*************************** 1. row ***************************
       Table: holdkey
Create Table: CREATE TABLE `holdkey` (
  `id` int(6) NOT NULL DEFAULT '0',
  `persons_id` int(6) NOT NULL,
  `year_nr` int(6) NOT NULL,
  `week_nr` int(6) NOT NULL,
  `week_day` int(6) NOT NULL,
  `activity_id` int(6) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Best Answer

You found a (known) limitation of the optimizer of MySQL.

The problem with DELETE FROM table WHERE column IN (SELECT ...) is that the subquery (SELECT) is executed for each row of the table that has a different value for column. In your case, as it is a PRIMARY KEY, it is executed a million times. More details in http://dev.mysql.com/doc/refman/5.6/en/subquery-restrictions.html.

The reason explaining why the SELECT * FROM table WHERE column IN (SELECT ...) is faster is that the optimizer is rewriting the query as a JOIN. You can see that as the EXPLAIN results of the 2 queries are different (MySQL 5.6.21):

mysql> EXPLAIN SELECT * FROM test_jfg.hours WHERE id IN (SELECT id FROM test_jfg.holdkey WHERE persons_id=60);
+----+-------------+---------+--------+---------------+---------+---------+---------------------+------+------------------------------+
| id | select_type | table   | type   | possible_keys | key     | key_len | ref                 | rows | Extra                        |
+----+-------------+---------+--------+---------------+---------+---------+---------------------+------+------------------------------+
|  1 | SIMPLE      | holdkey | ALL    | NULL          | NULL    | NULL    | NULL                |  920 | Using where; Start temporary |
|  1 | SIMPLE      | hours   | eq_ref | PRIMARY,id    | PRIMARY | 4       | test_jfg.holdkey.id |    1 | End temporary                |
+----+-------------+---------+--------+---------------+---------+---------+---------------------+------+------------------------------+
2 rows in set (0.00 sec)

mysql> EXPLAIN DELETE FROM test_jfg.hours WHERE id IN (SELECT id FROM test_jfg.holdkey WHERE persons_id=60);
+----+--------------------+---------+------+---------------+------+---------+------+--------+-------------+
| id | select_type        | table   | type | possible_keys | key  | key_len | ref  | rows   | Extra       |
+----+--------------------+---------+------+---------------+------+---------+------+--------+-------------+
|  1 | PRIMARY            | hours   | ALL  | NULL          | NULL | NULL    | NULL | 997457 | Using where |
|  2 | DEPENDENT SUBQUERY | holdkey | ALL  | NULL          | NULL | NULL    | NULL |    920 | Using where |
+----+--------------------+---------+------+---------------+------+---------+------+--------+-------------+
2 rows in set (0.00 sec)

According to http://dev.mysql.com/doc/refman/5.6/en/explain-output.html#explain_select_type, DEPENDENT SUBQUERY means:

For DEPENDENT SUBQUERY, the subquery is re-evaluated only once for each set of different values of the variables from its outer context.

To solve that, I do not see other solutions that doing 2 (or more) queries:

  1. Get the results from the SELECT,
  2. If the resultset is small, do a DELETE ... IN (explicit list)

If the resultset is large, you will need to do many DELETE like above with a subset of the list.