Mysql – InnoDB “holes” in large table 200 Million Rows

innodbMySQLssdtable

I have a large MySql InnoDB table with about 200 million rows. One of the columns has some data in it that I really don't need any more, so I'm thinking about removing that column, or setting all the cells in that column to NULL.

This column contains a text value of about 300 characters, so by removing it I'll save about 300 bytes x 200 million rows = 60GB.

But, will this leave a lot of "holes" in my data? Can MySql effectively and efficiently re-use this space without some sort of de-fragmentation? Or does that just not matter since it's InnoDB and on an SSD drive?

Thanks!

Update: The table has about columns total, but most of them are INT values. This text field is by far the biggest. Each row only takes up about 4KB of space total.

Best Answer

InnoDB stores data (and indexes) in 16KB blocks.

It sounds like you are shrinking each row by an average of 7% (300/4K).

If the shrinkage had been more than 50%, then probably InnoDB would have noticed adjacent half-full blocks and coalesced them, thereby freeing up some of the predicted savings. (A block is 16KB.) I say only "some" because it is not aggressive at packing the blocks after UPDATEs or DELETEs.

If you had not done the OPTIMIZE, future inserts/deletes/updates, if they were scattered around the table, would have filled in the 'holes' and/or led to blocking coalescing. If, on the other hand, the main activity is to add new rows at the 'end' of the PRIMARY KEY values, then this subsequent cleanup would not happen, and the OPTIMIZE would be desirable.

How much did the OPTIMIZE benefit you? Performance probably changed very little. Sure, you freed up some space, but, if you are not running out of disk space, that does not matter much.

Another issue... Tablespaces. The old default was to put all tables into a single tablespace (the file named ibdata1). It would grow but never shrink. Any freed blocks would be available for future database work, but would not be given back to the OS.

The new default (innodb_file_per_table) has the same problem. However, an explicit OPTIMIZE rebuilds the table, and does give the freed space back to the OS. Caveat: During the process, you need 2x the disk space. So, if you were very tight on space, OPTIMIZE would fail. However...

OPTIMIZE TABLE (for InnoDB) is "Performed in-place as of MySQL 5.6.17. In-place operation is not supported for tables with FULLTEXT indexes. The operation uses the INPLACE algorithm, but ALGORITHM and LOCK syntax is not permitted. " -- https://dev.mysql.com/doc/refman/5.6/en/innodb-online-ddl-operations.html