You should let the table roll back fully. If you kill the mysqld process, crash recovery will happen anyway.
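If you want to watch the rollback progress while you wait, InnoDB will report it (a sketch; the exact TRANSACTIONS wording varies by MySQL version):
SHOW ENGINE INNODB STATUS\G
-- in the TRANSACTIONS section, the transaction being rolled back is
-- flagged ROLLING BACK, with an undo-log entry count that shrinks as
-- the rollback proceeds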
As far as the DELETE query goes, you may want to execute it in chunks or all in one shot. Either way, I would like to suggest an alternate method for the deletion. Here is a DELETE JOIN method:
- Create a table to collect primary keys for the rows to DELETE from conv_with
- Perform a JOIN and DELETE simultaneously
Here is the SQL for it:
CREATE TABLE conv_with_delete SELECT id FROM conv_with WHERE 1=2;
ALTER TABLE conv_with_delete ADD PRIMARY KEY (id);
INSERT INTO conv_with_delete
SELECT id FROM conv_with
WHERE user_id_from IN (uid1, uid2, ..., uid100000);
DELETE B.* FROM conv_with_delete A INNER JOIN conv_with B USING (id);
You can either load conv_with_delete with all 20M ids at once, or cycle through one user_id_from at a time. Perhaps a stored procedure might work for you; the BulkZap procedure further down takes the list route, and a sketch of the one-at-a-time route follows here.
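If you take the one-at-a-time route, here is a minimal sketch of what that could look like. The helper table users_to_zap and the procedure name are my own inventions, not part of the original schema:
DELIMITER $$
DROP PROCEDURE IF EXISTS ZapOneUserAtATime $$
CREATE PROCEDURE ZapOneUserAtATime ()
BEGIN
    DECLARE done INT DEFAULT 0;
    DECLARE uid INT;
    -- users_to_zap(user_id INT PRIMARY KEY) is assumed to be preloaded
    -- with the user_id_from values to purge
    DECLARE c CURSOR FOR SELECT user_id FROM users_to_zap;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
    OPEN c;
    zap_loop: LOOP
        FETCH c INTO uid;
        IF done = 1 THEN LEAVE zap_loop; END IF;
        -- under autocommit each DELETE commits on its own, so a kill
        -- mid-run only rolls back the single user being deleted
        DELETE FROM conv_with WHERE user_id_from = uid;
    END LOOP;
    CLOSE c;
END $$
DELIMITER ;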
First let's make some sample data:
DROP DATABASE IF EXISTS noam;
CREATE DATABASE noam;
USE noam
CREATE TABLE conv_load (user_id_from int not null) ENGINE=MyISAM;
INSERT INTO conv_load (user_id_from) VALUES
( 0),( 1),( 2),( 3),( 4),( 5),( 6),( 7),( 8),( 9),
(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),
(20),(21),(22),(23),(24),(25),(26),(27),(28),(29);
-- each pass doubles the row count; 16 passes turn 30 rows into 30 * 2^16 = 1,966,080
INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
SELECT COUNT(1) FROM conv_load;
CREATE TABLE conv_with
(
id int not null auto_increment,
user_id_from int not null,
primary key (id),
key `from` (user_id_from)
) ENGINE=InnoDB;
INSERT INTO conv_with (user_id_from) SELECT * FROM conv_load;
Here is the creation of the sample data:
mysql> DROP DATABASE IF EXISTS noam;
Query OK, 2 rows affected (0.33 sec)
mysql> CREATE DATABASE noam;
Query OK, 1 row affected (0.00 sec)
mysql> USE noam
Database changed
mysql> CREATE TABLE conv_load (user_id_from int not null) ENGINE=MyISAM;
Query OK, 0 rows affected (0.13 sec)
mysql> INSERT INTO conv_load (user_id_from) VALUES
-> ( 0),( 1),( 2),( 3),( 4),( 5),( 6),( 7),( 8),( 9),
-> (10),(11),(12),(13),(14),(15),(16),(17),(18),(19),
-> (20),(21),(22),(23),(24),(25),(26),(27),(28),(29);
Query OK, 30 rows affected (0.00 sec)
Records: 30 Duplicates: 0 Warnings: 0
mysql> INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
Query OK, 30 rows affected (0.00 sec)
Records: 30 Duplicates: 0 Warnings: 0
mysql> INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
Query OK, 60 rows affected (0.00 sec)
Records: 60 Duplicates: 0 Warnings: 0
mysql> INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
Query OK, 120 rows affected (0.00 sec)
Records: 120 Duplicates: 0 Warnings: 0
mysql> INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
Query OK, 240 rows affected (0.00 sec)
Records: 240 Duplicates: 0 Warnings: 0
mysql> INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
Query OK, 480 rows affected (0.00 sec)
Records: 480 Duplicates: 0 Warnings: 0
mysql> INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
Query OK, 960 rows affected (0.00 sec)
Records: 960 Duplicates: 0 Warnings: 0
mysql> INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
Query OK, 1920 rows affected (0.00 sec)
Records: 1920 Duplicates: 0 Warnings: 0
mysql> INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
Query OK, 3840 rows affected (0.00 sec)
Records: 3840 Duplicates: 0 Warnings: 0
mysql> INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
Query OK, 7680 rows affected (0.00 sec)
Records: 7680 Duplicates: 0 Warnings: 0
mysql> INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
Query OK, 15360 rows affected (0.01 sec)
Records: 15360 Duplicates: 0 Warnings: 0
mysql> INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
Query OK, 30720 rows affected (0.01 sec)
Records: 30720 Duplicates: 0 Warnings: 0
mysql> INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
Query OK, 61440 rows affected (0.02 sec)
Records: 61440 Duplicates: 0 Warnings: 0
mysql> INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
Query OK, 122880 rows affected (0.04 sec)
Records: 122880 Duplicates: 0 Warnings: 0
mysql> INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
Query OK, 245760 rows affected (0.07 sec)
Records: 245760 Duplicates: 0 Warnings: 0
mysql> INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
Query OK, 491520 rows affected (0.15 sec)
Records: 491520 Duplicates: 0 Warnings: 0
mysql> INSERT INTO conv_load (user_id_from) SELECT user_id_from FROM conv_load;
Query OK, 983040 rows affected (0.30 sec)
Records: 983040 Duplicates: 0 Warnings: 0
mysql> SELECT COUNT(1) FROM conv_load;
+----------+
| COUNT(1) |
+----------+
|  1966080 |
+----------+
1 row in set (0.00 sec)
mysql> CREATE TABLE conv_with
-> (
-> id int not null auto_increment,
-> user_id_from int not null,
-> primary key (id),
-> key `from` (user_id_from)
-> ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.09 sec)
mysql> INSERT INTO conv_with (user_id_from) SELECT * FROM conv_load;
Query OK, 1966080 rows affected (25.41 sec)
Records: 1966080 Duplicates: 0 Warnings: 0
mysql>
OK, the table has 1,966,080 rows. Let's go about deleting anything whose user_id_from is 13 or 23.
mysql> CREATE TABLE conv_with_delete SELECT id FROM conv_with WHERE 1=2;
Query OK, 0 rows affected (0.23 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> ALTER TABLE conv_with_delete ADD PRIMARY KEY (id);
Query OK, 0 rows affected (0.22 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> INSERT INTO conv_with_delete
-> SELECT id FROM conv_with
-> WHERE user_id_from IN (13,23);
Query OK, 131072 rows affected (2.39 sec)
Records: 131072 Duplicates: 0 Warnings: 0
mysql> DELETE B.* FROM conv_with_delete A INNER JOIN conv_with B USING (id);
Query OK, 131072 rows affected (8.22 sec)
mysql>
OK, so the DELETE JOIN works in practice when executed manually.
OK, now for a stored procedure to encapsulate this:
DELIMITER $$
DROP PROCEDURE IF EXISTS `noam`.`BulkZap` $$
CREATE PROCEDURE `noam`.`BulkZap` (ListOfIDs VARCHAR(1024))
BEGIN
DROP TABLE IF EXISTS conv_with_delete; -- note: this also drops the permanent conv_with_delete left over from the manual run above
CREATE TEMPORARY TABLE conv_with_delete SELECT id FROM conv_with WHERE 1=2;
ALTER TABLE conv_with_delete ADD PRIMARY KEY (id);
SET @sql = CONCAT('INSERT INTO conv_with_delete SELECT id ',
'FROM conv_with WHERE user_id_from IN (',ListOfIDs,')');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
DELETE B.* FROM conv_with_delete A INNER JOIN conv_with B USING (id);
END $$
DELIMITER ;
Let's test it with:
- Count before the DELETE JOIN
- Execute the DELETE JOIN
- Count after the DELETE JOIN
Here we go:
mysql> SELECT COUNT(1) FROM conv_with;
+----------+
| COUNT(1) |
+----------+
|  1966080 |
+----------+
1 row in set (0.50 sec)
mysql> CALL BulkZap('13,23');
Query OK, 131072 rows affected (5.49 sec)
mysql> SELECT COUNT(1) FROM conv_with;
+----------+
| COUNT(1) |
+----------+
|  1835008 |
+----------+
1 row in set (0.48 sec)
mysql>
Hey, not bad: it deleted 131,072 rows in 5.49 seconds. To use it, call the stored procedure with a comma-separated list of user_id_from values. One caveat: ListOfIDs is declared VARCHAR(1024), which only holds a short list, so for thousands of ids either widen the parameter (to TEXT, say) or call the procedure in batches.
Give it a Try !!!
Best Answer
When MySQL alters a table, it essentially makes a copy of it and then swaps the copy in. That way, if you cancel the ALTER in the middle, the table is still in a stable state. So you can look in the MySQL data directory (/var/lib/mysql/ ?) to see how large the new file is; that will tell you how far along it is. This is a bit more difficult with InnoDB, but there is a tmp table being created somewhere.
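If you would rather check from inside mysql than from the filesystem, the processlist gives a rough signal (a sketch; the exact State text depends on the MySQL version):
SHOW FULL PROCESSLIST;
-- while the ALTER runs, its State column reads "copy to tmp table";
-- the matching #sql-* file growing in the data directory is the copy
-- you can measure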
You can significantly reduce the amount of time an index build takes by increasing your sort buffer variables (myisam_sort_buffer_size, sort_buffer_size). Make those as large as you can with the memory you have. You could cut a few days off the modification time, maybe even get it down to a few hours, depending on how much memory you have. I've done a 150M-record table in about 3 hours.
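As a sketch of that tuning (the 256M figures are examples only; size them to your actual RAM), raise the buffers in the session that will run the ALTER:
SET SESSION myisam_sort_buffer_size = 256 * 1024 * 1024; -- example value
SET SESSION sort_buffer_size = 256 * 1024 * 1024; -- example value
-- then issue the long ALTER TABLE from this same session so the
-- index build sees the larger buffers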