I’ve been researching how to optimize only fragmented tables in MySQL and reviewed this post on optimizing tables. It basically performs a query against the information_schema database for any table with data_free > 0
and builds a SQL statement to OPTIMIZE
only those tables. I ran this query and it identified 148 tables for optimization. All of the tables identified are InnoDB tables. After executing the resultant optimization SQL script, I re-ran the original script to identify fragmented tables and it returned the exact same tables during the first pass.
I have seen conflicting posts regarding the InnoDB tables and the OPTIMIZE
command. Some say that OPTIMIZE
will not work with InnoDB tables and that you need to run ALTER TABLE table_name ENGINE=INNODB
. Others say that OPTIMIZE
actually calls the ALTER TABLE
command when executing against InnoDB tables. With that in mind, I ran the ALTER TABLE
command against one of the InnoDB tables identified as being fragmented (data_free > 0
) and found that the data_free
did not change afterwards. It’s still greater than 0. I also re-started MySQL and checked it only to find the same results.
Now, we have several servers running MySQL 5.5.29 in our organization and I ran a query against all of them to identify any InnoDB tables with DATA_FREE=0 or NULL
and none were returned. They are all greater than zero.
I also ran the OPTIMIZE
command against a few MyISAM
tables where DATA_FREE
was greater than zero and verify that it was zero afterwards.
Can anyone shed some light on this for me? What is the proper method to remove fragmentation from InnoDB tables? What is the proper method to determine fragmented InnoDB tables?
Thanks
Best Answer
I'll assume you are using
innodb_file_per_table
for this answer.There is more than one meaning to "InnoDB fragmentation":
.ibd
file is fragmented, and is very large whereas the dataset is smallPlease consider this post I wrote a while back: it shows how after purging many rows from a large table, the data file is fragmented (i.e. it is very large in the filesystem -- it's a known issue these files never reduce in size). And yet the indexes were not fragmented by the end of deletion: this is because InnoDB properly merges pages as they become empty(er).
The
OPTIMIZE
command indeed does not apply on InnoDB. What it does is rebuild the table (exactly like anALTER
). See this:As for
DATA_FREE
: I suggest that you simply ignore this variable. To be honest, I've been working with InnoDB tables for10
years, and have never found this value to be very consistent with anything.And now it's time for the real discussion: what exactly are you trying to achieve? Unless your database is completely stale, there will always be some fragmentation. It is natural to the process of adding, removing and updating rows in your table.
Fragmentation is not that evil: free space can get reclaimed by new data. If you tables are not very large, then just forget about the whole thing. For very large tables, you might gain some disk space by optimizing the table. But ask yourself: how soon would the table reach same fragmentation? An hour? A day? A week? IMHO in all these cases it is pointless to optimize the table.
Nevertheless, if a large table is massively purged of data, which is not expected to return, I'm all for optimizing it. Say you realize you have some redundant data which consists some 30% of your table size. Sure, it would be great to have that disk space back.
Bottom line: only consider these issues with very large tables; only if you have issues with disk space.