Mysql – How to improve InnoDB DELETE performance

deleteinnodbMySQLperformance

So I have this audit table (tracks actions on any table in my database):

CREATE TABLE `track_table` (
  `id` int(16) unsigned NOT NULL,
  `userID` smallint(16) unsigned NOT NULL,
  `tableName` varchar(255) NOT NULL DEFAULT '',
  `tupleID` int(16) unsigned NOT NULL,
  `date_insert` datetime NOT NULL,
  `action` char(12) NOT NULL DEFAULT '',
  `className` varchar(255) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `userID` (`userID`),
  KEY `tableID` (`tableName`,`tupleID`,`date_insert`),
  KEY `actionDate` (`action`,`date_insert`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

and I need to start archiving outdated items. The table has grown to about 50million rows, so the fastest way I could delete the rows was to delete it a table at a time (based on tableName).

This works pretty well but on some of the tables that are write-heavy, it won't complete. My query deletes all items that have an associated delete action on a tupleID/tableName combination:

DELETE FROM track_table WHERE tableName='someTable' AND tupleID IN (
  SELECT DISTINCT tupleID FROM track_table
  WHERE tableName='someTable' AND action='DELETE' AND date_insert < DATE_SUB(CURDATE(), INTERVAL 30 day)
)

I let this run on my server for 3 days and it never completed for the largest table. The explain output (if I switch the delete to select:

| id | select_type        | table       | type | possible_keys      | key     | key_len | ref        | rows    | Extra                        |
|  1 | PRIMARY            | track_table | ref  | tableID            | tableID | 257     | const      | 3941832 | Using where                  |
|  2 | DEPENDENT SUBQUERY | track_table | ref  | tableID,actionDate | tableID | 261     | const,func |       1 | Using where; Using temporary |

So 4 million rows shouldn't take 3 days to delete, I would think. I have my innodb_buffer_pool_size set to 3GB, and the server is not set to use one_file_per_table. What other ways can I improve InnoDB delete performance? (Running MySQL 5.1.43 on Mac OSX)

Best Answer

You could delete data in batches.

In SQL Server, the syntax is delete top X rows from a table. You then do it in a loop, with a transaction for each batch (if you have more than one statement, of course), so to keep transactions short and maintain locks only for short periods.

In MySQL syntax: DELETE FROM userTable LIMIT 1000

There are restrictions on that (can't use LIMIT in deletes with joins, for instance) but in this case you might be able to do it that way.

There is an additional danger to using LIMIT with DELETE when it comes to replication; the rows deleted are sometimes not deleted in the same order on the slave as it was deleted on the master.