OPTIMIZE TABLE performs the following steps internally on a table (mydb.mytable):
CREATE TABLE mydb.mytablenew LIKE mydb.mytable;
INSERT INTO mydb.mytablenew SELECT * FROM mydb.mytable;
ALTER TABLE mydb.mytable RENAME mydb.mytablezap;
ALTER TABLE mydb.mytablenew RENAME mydb.mytable;
DROP TABLE mydb.mytablezap;
Since DDL is involved, there is no way around queries taking a big performance hit during the operation. At the same time, skipping optimization altogether would be just as bad.
What you need is to have MySQL Master/Master (aka Circular) Replication set up
You could then try this:
For Servers M1 and M2, with the DB VIP pointing to M1
On M2, run the following:
STOP SLAVE;
SET sql_log_bin = 0;
Perform OPTIMIZE TABLE or ALTER TABLE ... ENGINE=InnoDB on all InnoDB tables
START SLAVE;
Wait for replication to catch up (Seconds_Behind_Master = 0)
Setting sql_log_bin = 0 prevents the DDL commands from replicating over to the Master.
Once those steps are complete, promote the Slave to Master and demote the Master to Slave (this can be done by simply moving your DB VIP from M1 to M2). You could perform this maintenance every day, and production will not feel any effects, with the exception of the Master promotion and Slave demotion.
You could create the script and run it on M2 like this:
echo "SET sql_log_bin = 0;" > InnoDBCompression.sql
echo "STOP SLAVE;" >> InnoDBCompression.sql
mysql -u... -p... -AN -e"SELECT CONCAT('ALTER TABLE ',table_schema,'.',table_name,' ENGINE=InnoDB;') InnoDBCompressionSQL FROM information_schema.tables WHERE engine='InnoDB' ORDER BY (data_length+index_length)" >> InnoDBCompression.sql
echo "START SLAVE;" >> InnoDBCompression.sql
mysql -u... -p... -A < InnoDBCompression.sql
From here, just wait for Seconds_Behind_Master to be 0 on M2, then move the DB VIP from M1 to M2. If you know the specific names of the tables you want optimized, you could adjust the query to fetch just those tables.
Give it a Try !!!
CAVEAT
Here is a fair warning: if you have innodb_file_per_table disabled, every time you run OPTIMIZE TABLE
or ALTER TABLE ... ENGINE=InnoDB;
the ibdata1 file just grows. You would need to clean up the InnoDB infrastructure to keep ibdata1 from growing out of control.
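Before scheduling the rebuilds, it is worth confirming how the server is configured. This is a minimal check (run against any MySQL server; no names assumed beyond the variable itself):

```sql
-- If this is OFF, OPTIMIZE/ALTER rebuilds copy the table back into
-- the shared ibdata1 tablespace, which only ever grows
SHOW VARIABLES LIKE 'innodb_file_per_table';
```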
Altering big tables should be done in phases:
- Create a new table with the required fields and indexes, say in a test DB (structure only)
- Dump the data from the existing table and load it into the newly created table in the test DB
- Now announce your downtime :)
- Swap the tables by renaming:
RENAME TABLE ur_db.table_name TO test.temp, test.table_name TO ur_db.table_name, test.temp TO test.table_name;
This is an atomic operation taking a fraction of a second.
- Load the extra records into the newly created table (the records that came in after the dump and load). This step can also be run once before Step 3 to reduce your downtime.
- And your system is back
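The phases above can be sketched in SQL like this, reusing the ur_db.table_name and test names from the rename example (in practice, Step 1 would spell out the new columns and indexes instead of LIKE, and Step 2 could equally be a mysqldump dump-and-load):

```sql
-- Step 1: create the new structure in the staging DB
CREATE TABLE test.table_name LIKE ur_db.table_name;

-- Step 2: bulk copy of the existing rows
INSERT INTO test.table_name SELECT * FROM ur_db.table_name;

-- Step 4: atomic three-way swap; takes a fraction of a second
RENAME TABLE ur_db.table_name TO test.temp,
             test.table_name  TO ur_db.table_name,
             test.temp        TO test.table_name;
```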
A few notes:
- You need not hit information_schema directly like this; try using
SHOW TABLE STATUS FROM db LIKE 'table_name'
- The speed of ALTER TABLE is more or less bound by I/O speed. The last time we ran a direct ALTER TABLE (without the above steps) on a 40GB+ table, it took around 4 hours. If your 20GB of data is taking ages, you are working on some outdated machine.
- Also, drop unwanted indexes like
index_bslt_ondate, index_base_schedule_line_items_loan_base_schedule
since other indexes already have the same leftmost column as their indexed column
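Dropping such a redundant index looks like this, reusing the table name from the rename example above (verify with SHOW INDEX that another index really does lead with the same column before dropping anything):

```sql
-- List all indexes and their column order to spot duplicated leftmost columns
SHOW INDEX FROM ur_db.table_name;

-- Drop the redundant index once confirmed
ALTER TABLE ur_db.table_name DROP INDEX index_bslt_ondate;
```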
Please let me know if you need any clarification on any of these steps.
Edit: A simple Python script to automate the process: https://github.com/georgecj11/hotswap_mysql_table.git
Best Answer
Starting with MySQL 5.6, the performance_schema instruments table I/O and computes aggregated statistics by table and by index. See the table performance_schema.table_io_waits_summary_by_index_usage: http://dev.mysql.com/doc/refman/5.6/en/table-waits-summary-tables.html#table-io-waits-summary-by-index-usage-table
Finding the least recently used index would require times and timestamps, which are not recorded. What the performance_schema does measure is a count of I/O against each index, so it can be used to find the least often used index, which in practice should be pretty close.
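For example, this query lists indexes that have recorded no I/O at all (MySQL 5.6+ with the performance_schema enabled; note the counters reset at server restart, so let the server run through a representative workload first):

```sql
-- Indexes with zero recorded I/O since the server started;
-- PRIMARY and the NULL (table scan) rows are excluded
SELECT object_schema, object_name, index_name
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE index_name IS NOT NULL
  AND index_name <> 'PRIMARY'
  AND count_star = 0
ORDER BY object_schema, object_name;
```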
Full example here:
http://sqlfiddle.com/#!9/c129c/4