This sounds a lot like another post I wrote back on August 27, 2012: How To Optimize and Repair InnoDB tables? ALTER and OPTIMIZE table failed.
With innodb_file_per_table disabled, the system tablespace ibdata1 is the home of seven classes of information:
- Data Pages for InnoDB Tables
- Index Pages for InnoDB Tables
- Data Dictionary
- Double Write Buffer
  - Safety Net to Prevent Data Corruption
  - Helps Bypass OS for Caching
- Insert Buffer (Streamlines Changes to Secondary Indexes)
- Rollback Segments
- Undo Logs
With innodb_file_per_table disabled, running OPTIMIZE TABLE against any InnoDB table is really asking for trouble. Why? Because all OPTIMIZE TABLE does is rewrite all data pages and index pages for a given table contiguously within ibdata1. That makes ibdata1 grow. In light of this, shrinking ibdata1 is totally impossible.
Please note that even if innodb_file_per_table were enabled and you ran OPTIMIZE TABLE, the table would be extracted into an external file. Yet, the space it leaves behind in ibdata1 is unrecoverable.
I wrote a nice one-time Cleanup Procedure of ibdata1 in StackOverflow back on Oct 29, 2010.
Here are my other posts on this subject of InnoDB and its effects on ibdata1.
SUMMARY
Even if you implement the InnoDB Cleanup (which separates all data and index pages from ibdata1), ibdata1 can still grow in a heavy-write, heavy-transaction environment due to the five other classes of information (Data Dictionary, Double Write Buffer, Insert Buffer, Rollback Segments, Undo Logs).
If you want to know how much space is used by data and index pages in ibdata1, run this:
SELECT InnoDB_Bytes,InnoDB_Bytes/POWER(1024,3) InnoDB_GB
FROM (SELECT SUM(data_length+index_length) InnoDB_Bytes
FROM information_schema.tables WHERE engine='InnoDB') A;
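The POWER(1024,3) divisor is simply the bytes-to-GB conversion. As a quick sanity check outside MySQL, the same arithmetic can be done in shell; the byte count below is a made-up example value, not output from any real server:

```shell
# Convert a byte count (as the InnoDB_Bytes query would return)
# to gigabytes, mirroring POWER(1024,3) in SQL.
innodb_bytes=80530636800   # hypothetical SUM(data_length+index_length), 75 GB
awk -v b="$innodb_bytes" 'BEGIN { printf "InnoDB_GB: %.2f\n", b / (1024^3) }'
```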
Now as for ibdata1, it still contains the Data Dictionary, Double Write Buffer, Insert Buffer, Rollback Segments, Undo Logs, and unused pages due to fragmentation. There is really no way to know:
- the exact amount of fragmentation
- how much space is used by the following (since they change rapidly):
- Data Dictionary
- Double Write Buffer
- Insert Buffer
- Rollback Segments
- Undo Logs
For the sake of simplicity, I will say this: just subtract InnoDB_Bytes from the filesize of ibdata1.
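That subtraction can be sketched in shell. Both byte counts below are hypothetical placeholders; in practice the first would come from ls -l on ibdata1 (path varies by install) and the second from the InnoDB_Bytes query above:

```shell
# Estimate non-table overhead in ibdata1: filesize minus InnoDB_Bytes.
ibdata1_bytes=106300440576     # hypothetical ibdata1 file size (99 GB)
innodb_bytes=80530636800       # hypothetical data+index pages total (75 GB)
awk -v f="$ibdata1_bytes" -v t="$innodb_bytes" \
    'BEGIN { printf "Overhead+fragmentation: %.2f GB\n", (f - t) / (1024^3) }'
```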
After doing the InnoDB Cleanup, you should schedule running OPTIMIZE TABLE on every InnoDB table that is transaction-heavy. That will actually shrink the .ibd file for every InnoDB table. The ibdata1 file will grow much more slowly thereafter. Yet, you will still have to live with some transactional growth.
If you implement the InnoDB Cleanup but leave innodb_file_per_table disabled, that will shrink ibdata1, but it will just climb back to 99G and beyond through normal application usage.
If you do not implement the InnoDB Cleanup, no OPTIMIZE TABLE will ever shrink ibdata1.
RECOMMENDATION
Please implement the InnoDB Cleanup with innodb_file_per_table enabled. Going forward, you should create a cronjob to run OPTIMIZE TABLE on all tables that experience mass INSERTs, mass UPDATEs, and mass DELETEs.
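One way to sketch such a cronjob is with mysqlcheck, whose --optimize option issues OPTIMIZE TABLE. The schedule, credentials, database, table names, and log path below are all assumptions; adjust for your environment:

```shell
# Hypothetical weekly crontab entry (Sunday 03:00).
# mysqlcheck --optimize runs OPTIMIZE TABLE on the listed tables.
# m h dom mon dow  command
0 3 * * 0  mysqlcheck --optimize mydb mytable1 mytable2 --user=admin --password=... >> /var/log/optimize_mydb.log 2>&1
```

This is a config fragment, not a script; drop it into crontab -e after filling in real credentials.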
DISKSPACE FOR EVERYTHING INNODB
SELECT FORMAT(SUM(data_length+index_length)/POWER(1024,3),2) InnoDB_DiskSpace
FROM information_schema.tables
WHERE engine='InnoDB';
DISKSPACE FOR DATABASE mydb BY TABLE
SELECT
IFNULL(tbl,'Total') table_name,
FORMAT(SUM(table_bytes)/POWER(1024,3),2) table_size
FROM
(
SELECT table_name tbl,SUM(data_length+index_length) table_bytes
FROM information_schema.tables WHERE table_schema='mydb'
GROUP BY table_name WITH ROLLUP
) A ORDER BY ISNULL(tbl) DESC,table_bytes;
DISKSPACE FOR INDIVIDUAL TABLE (mydb.mytable)
SELECT FORMAT(SUM(data_length+index_length)/POWER(1024,3),2) Table_Diskspace
FROM information_schema.tables
WHERE table_schema='mydb'
AND table_name='mytable';
These queries will tell you how much space you need for any table, database, and engine.
SUGGESTION #1
If you would like to simulate OPTIMIZE TABLE just to see diskspace usage, run the following steps on mydb.mytable:
use mydb
CREATE TABLE mytable_new LIKE mytable;
INSERT INTO mytable_new SELECT * FROM mytable;
ANALYZE TABLE mytable_new;
SELECT FORMAT(SUM(data_length+index_length)/POWER(1024,3),2) Table_Diskspace
FROM information_schema.tables
WHERE table_schema='mydb'
AND table_name='mytable_new';
#
# Stop Here. If you are happy with the size, do the last two commands
# If you prefer, drop table mytable_new and do the online schema change
#
RENAME TABLE mytable TO mytable_old, mytable_new TO mytable;
DROP TABLE mytable_old;
Doing this can give an idea of how big the .ibd file will be after running OPTIMIZE TABLE.
SUGGESTION #2
Please do not convert the table to MyISAM. That would build two files (.MYD and .MYI), and then you would have to convert back to InnoDB again. It is essentially the same as doing SUGGESTION #1. MyISAM is also a little more prone to corruption, with no recovery processes in place.
EPILOGUE
By all means, you can proceed with pt-online-schema-change if your data is live. My answer simply gives you queries to estimate sizes of the tables and approximate temp table sizes for OPTIMIZE TABLE.
If you are doing this during off hours, try my post Recover the disk space after deleting rows from table
GIVE IT A TRY !!!
Best Answer
Here is something to consider: you need to determine how much fragmentation exists before deciding to defragment the table or not.
The most efficient way is to measure the space usage from the OS and from INFORMATION_SCHEMA.
For this example, assume the following:
- the datadir is /var/lib/mysql
- the database is mydb
- the table is mytable

From the OS, measure the size of the table's file on disk (TABLE_SIZE_OS). From INFORMATION_SCHEMA, measure data_length + index_length for the table (@table_size_inf).
Simply subtract @table_size_inf from TABLE_SIZE_OS to get the number of bytes unused.
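Sketched with placeholder numbers matching the 60GB-out-of-70GB example (both values here are illustrative, not measured from any server):

```shell
# TABLE_SIZE_OS would come from the size of the .ibd file on disk;
# @table_size_inf from data_length+index_length in information_schema.tables.
awk 'BEGIN {
  table_size_os  = 70 * 1024^3   # bytes the file occupies on disk
  table_size_inf = 10 * 1024^3   # bytes still holding live data/index pages
  unused = table_size_os - table_size_inf
  printf "unused: %.0f GB (%.2f%% of the file)\n", unused / (1024^3), 100 * unused / table_size_os
}'
```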
Since you deleted 85.7142 % of the table (60GB out of 70GB), you could then run OPTIMIZE TABLE against it (bearing in mind, as noted above, that with innodb_file_per_table disabled the rewrite lands inside ibdata1 and only makes it grow).
You could do this against an entire database. I have suggested stuff like this before:
- Apr 11, 2012 : How do you remove fragmentation from InnoDB tables?
- Jul 03, 2013 : Information about Disk Storage MySQL

UPDATE 2017-01-31 07:40 EST
You just asked the following
My answer to that is Yes. Deleted rows are simply marked and are available for reuse.
However, each 16K block would have its own internal fragmentation. Even my answer does not compute the granularity of fragmentation within 16K blocks. The more sparse a 16K block is due to deleted space, the more disk I/O InnoDB will experience in traversing 16K blocks in search of free space wide enough for row insertion.
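To make that concrete, here is some rough arithmetic (every number below is an assumption, not a measurement) showing how sparse pages multiply the page count, and therefore the I/O, for the same number of rows:

```shell
# Sparser 16K pages mean more pages, hence more disk I/O, for the same rows.
awk 'BEGIN {
  page = 16384; row = 160; rows = 1000000   # assumed page size, row size, row count
  per_page_full   = int(page / row)          # 102 rows fit in a packed page
  per_page_sparse = int(per_page_full * 0.4) # only 40 rows at 40% fill
  printf "pages when packed: %d\n", int((rows + per_page_full - 1) / per_page_full)
  printf "pages at 40%% fill: %d\n", int((rows + per_page_sparse - 1) / per_page_sparse)
}'
```

At 40% fill the same million rows occupy roughly two and a half times as many pages.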
If you are satisfied with the rate of your bulk insertions, then there is no need to OPTIMIZE TABLE that one table. If your index statistics against that table make SELECT queries bad, just run ANALYZE TABLE instead of OPTIMIZE TABLE.