MySQL – Deleting and reclaiming space from an InnoDB table


I have a 700GB InnoDB table which I'm not writing any more data to (only reading).
I would like to delete the older data it holds and reclaim that disk space (as I'm running out of it). The delete part is pretty easy: I have an auto-increment primary key, so I can iterate over it in chunks and delete the rows, but that won't give me the disk space back. I assume OPTIMIZE TABLE would, but that might take forever on a 700GB table. Is there another option I'm overlooking?
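The chunked delete described above can be sketched as follows (a sketch only; the table name, the cutoff `id`, and the batch size are assumptions, not taken from the question):

```sql
-- Delete old rows in small batches so each transaction stays short
-- and undo/replication pressure stays low. Repeat the statement
-- until it reports "0 rows affected".
DELETE FROM mybigtable
WHERE id < 1000000000   -- hypothetical cutoff for the "older data"
ORDER BY id
LIMIT 10000;            -- hypothetical batch size
```

Keep in mind that these deletes only mark pages as free for reuse inside the InnoDB tablespace; they do not return the space to the operating system.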

Edit by RolandoMySQLDBA

Assuming your table is mydb.mytable, please run the following query and post its output here, so we can determine the disk space needed for the table's shrinkage:

SELECT
    FORMAT(dat/POWER(1024,3),2) datsize,
    FORMAT(ndx/POWER(1024,3),2) ndxsize,
    FORMAT((dat+ndx)/POWER(1024,3),2) tblsize
FROM (SELECT data_length dat, index_length ndx
      FROM information_schema.tables
      WHERE table_schema='mydb' AND table_name='mytable') A;

We also need to see the table structure, if allowed.

Edit by Noam

This is the output of the query:

datsize  ndxsize  tblsize
682.51   47.57    730.08

This is the table structure (SHOW CREATE TABLE):

CREATE TABLE `mybigtable` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `uid` int(11) NOT NULL,
  `created_at` datetime NOT NULL,
  `tid` bigint(20) NOT NULL,
  `text` varchar(255) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
  `ft` tinyint(1) NOT NULL,
  `irtsd` bigint(20) NOT NULL,
  `irtuid` int(11) NOT NULL,
  `rc` int(11) NOT NULL,
  `r` tinyint(1) NOT NULL,
  `e` text CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
  `timezone` varchar(5) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `uid_tid` (`uid`,`tid`)
) ENGINE=InnoDB AUTO_INCREMENT=2006963844 DEFAULT CHARSET=utf8

Best Answer

This is a good question. You have several solutions but your table is quite big so none will be without pain :)

You have four solutions to "shrink" InnoDB tables:

1. OPTIMIZE TABLE

You can use OPTIMIZE TABLE as you mentioned, but you should first check the innodb_file_per_table variable:

mysql> show variables like "innodb_file_per_table";
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| innodb_file_per_table | ON    |
+-----------------------+-------+
1 row in set (0.00 sec)

Let me explain:

With InnoDB tables, OPTIMIZE TABLE locks the table, copies the data into a new clean table (which is why the result is smaller), drops the original table, and renames the new table to the original name. That is why you should make sure you have twice the size of the original table available on disk. You'll probably need less, since the optimized table will be smaller, but it's better to be safe than sorry.

innodb_file_per_table = ON: In this mode, each table has its own data file. OPTIMIZE creates a new data file with optimized space usage; when the operation finishes, MySQL drops the original file and replaces it with the optimized one. So at the end, the ~700GB (probably less, because the rebuilt table is smaller) occupied by the old file is actually released to the operating system.

innodb_file_per_table = OFF: In this mode, all data lives in a single shared data file: ibdata. This mode has a big drawback: ibdata can never shrink. During the OPTIMIZE process the new copy of your table is written inside ibdata (growing it by close to 700GB), and even after the drop-and-rename step (the end of the OPTIMIZE phase) ibdata will not release that ~700GB back to the OS. So you wanted to free some space, and instead you have 700GB more used. Cool, isn't it?
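With innodb_file_per_table = ON, the rebuild itself is a single statement (table and schema names here are taken from the question's output above):

```sql
OPTIMIZE TABLE mydb.mybigtable;
```

Don't be alarmed if the output includes "Table does not support optimize, doing recreate + analyze instead": for InnoDB this is normal, and it means exactly the copy-drop-rename rebuild described above.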

2. ALTER TABLE

You can also use an ALTER TABLE statement; a no-op rebuild works the same way as OPTIMIZE TABLE. You can just use:

ALTER TABLE myTable ENGINE=InnoDB;

3. ALTER TABLE (ONLINE)

The problem with OPTIMIZE TABLE and ALTER TABLE is that they lock the table for the duration of the operation. Instead, you can use Percona's pt-online-schema-change (from the Percona Toolkit). It provides a mechanism to rebuild the table while keeping the original available for reads and writes. I use this tool in production for ALTER statements on big tables and it's pretty cool.

Note that any FOREIGN KEYs referencing your table might complicate things, since the locks taken on it can cascade to the referencing tables. To check for them, simply query:

mysql> SELECT COUNT(*) FROM information_schema.REFERENTIAL_CONSTRAINTS WHERE REFERENCED_TABLE_NAME = "myTable";
+----------+
| COUNT(*) |
+----------+
|        0 |
+----------+
1 row in set (0.04 sec)

Here is how I use pt-online-schema-change:

pt-online-schema-change --alter "ENGINE=InnoDB" D=myBase,t=myTable --user=myUser --ask-pass --execute

Note that the caveat about innodb_file_per_table above applies to this solution as well.

4. mysqldump

The last solution is to recreate everything from a mysqldump. It takes forever, but it's extremely effective. Note that this is the only way to shrink your ibdata file if innodb_file_per_table is OFF.
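A rough outline of the dump-and-reload route, as a sketch only (the user name, file names, and the re-initialization step are assumptions and vary by MySQL version):

```shell
# 1. Dump everything; --single-transaction takes a consistent
#    InnoDB snapshot without locking the tables against reads.
mysqldump -u root -p --single-transaction --all-databases > full_dump.sql

# 2. Stop mysqld, move the bloated ibdata* and ib_logfile* files
#    aside, and enable innodb_file_per_table in my.cnf so future
#    rebuilds can reclaim space per table.

# 3. Restart mysqld (it re-creates a small ibdata) and reload:
mysql -u root -p < full_dump.sql
```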

Max.