What is the best way to schedule automatic optimization of tables in a MySQL InnoDB database? Can I use events for example? I have recently had a big performance issue (when querying) with one of the tables which is actually the largest one in my database and frequently being updated. After I run OPTIMIZE on the table, it has solved the problem.
MySQL InnoDB Optimization – Scheduled Table Optimization
Related Solutions
There are situations where InnoDB deadlock would come up when you least expect it. For example, SELECT queries can perform locks on the gen_clust_index, aka the Clustered Index.
Here are three past questions I aggressively looked over with @RedBlueThing, the person who asked these questions. @RedBlueThing found workarounds for his questions.
- Reasons for occasionally slow queries?
- Trouble deciphering a deadlock in an innodb status log
- Will these two queries result in a deadlock if executed in sequence?
Just to keep your question in perspective, when you look over these answers (don't look too deeply, even I get dizzy looking at my own convoluted answers) it should be quickly apparent that even SELECT queries can lock InnoDB data. Although autocommit could be enabled (rendering each query its own transaction), a single SQL statement can still be victimized by deadlocks.
You also have special cases of SELECT where you can lock specific rows on demand.
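For example (a sketch; mydb.mytable and the WHERE clause are placeholder names, not from the questions above):

```sql
-- Shared (S) lock on the matching rows; other sessions can read but not modify them
SELECT * FROM mydb.mytable WHERE id = 1 LOCK IN SHARE MODE;

-- Exclusive (X) lock on the matching rows, as if they were about to be UPDATEd
SELECT * FROM mydb.mytable WHERE id = 1 FOR UPDATE;
```

Either form can participate in a deadlock just like an UPDATE would.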
Based on the InnoDB Deadlocking link, the sequence of events that causes this situation could theoretically be as follows:
- Your SQL UPDATEs a single row but generates an error
- The UPDATE causes a rollback of the one row
- The row has a lingering lock
Personally, that last statement scares me. It would have been nice for MySQL to inform everyone of this quirk instead of just documenting it and hoping you google for it. Yet, that statement is from the MySQL Documentation. (Oh yeah, Oracle owns InnoDB)
Before answering the question, please click here to see the InnoDB Infrastructure Map
Based on innodb_file_per_table being disabled, let's go through your questions:
Q 1: How much fragmentation is allowed before it affects performance?
The system tablespace can grow to the limit of the disk volume.
EXAMPLE: I just answered a question about what to do when the system tablespace reaches the limit of an ext3 disk: How to solve "The table ... is full" with "innodb_file_per_table"?
There may still be some wiggle room inside the system tablespace. However, when the wiggle room dwindles to the point that all 1023 undo logs inside the system tablespace are completely filled and can no longer extend, then you must add a new system tablespace file.
Please note that when I say wiggle room, I am referring to the free space within the system tablespace that must accommodate the following:
- Data Dictionary
- Double Write Buffer (can be disabled but not recommended)
- Insert Buffer (Cached Index Changes in System Tablespace instead of the OS)
- Rollback Segments (1023 slots)
- Undo Logs (referenced from the Rollback Segments)
- Please refer back to the InnoDB Infrastructure Map
Q 2: Should InnoDB tables even be optimized (some say yes others say no)?
If you run OPTIMIZE TABLE, you basically make the data and index pages contiguous in the system tablespace. This defragments the table and makes access to all data and indexes quicker, until the fragmentation reappears over time in production use. That renewed fragmentation can fill up with data and indexes again, and this will endanger the wiggle room I mentioned before.
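In fact, when you run OPTIMIZE TABLE against an InnoDB table (mydb.mytable is a placeholder name here), mysqld tells you outright that it is really doing a rebuild:

```sql
OPTIMIZE TABLE mydb.mytable;
-- The result set includes a Note row:
-- "Table does not support optimize, doing recreate + analyze instead"
```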
Q 3: How do you test for InnoDB fragmentation if the server does not use the “file per table” option?
Back on Aug 27, 2012, I answered this post: How To Optimize and Repair InnoDB tables? ALTER and OPTIMIZE table failed
I explained there how to get the fragmentation. In essence, you do this:
Goto the OS and run
cd /var/lib/mysql
ls -l ibdata1 | awk '{print $5}'
This gets you the size of ibdata1 in bytes
SELECT (data_length+index_length) InnoDBDataIndexBytes
FROM information_schema.tables WHERE engine='InnoDB';
This gets you the sum total of data and index pages in bytes
Subtract the sum total from ibdata1's total bytes. The difference represents the wiggle room. This space causes fragmentation, but is constantly in use until ibdata1 gets filled.
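The two steps can be combined into one small script (a sketch; it assumes the default datadir /var/lib/mysql and a ~/.my.cnf with login credentials so mysql needs no flags):

```shell
#!/bin/sh
# Size of ibdata1 on disk, in bytes
IBDATA1=$(ls -l /var/lib/mysql/ibdata1 | awk '{print $5}')

# Sum of all InnoDB data and index pages, in bytes
DATANDX=$(mysql -ANe "SELECT SUM(data_length+index_length)
  FROM information_schema.tables WHERE engine='InnoDB'")

# The difference is the wiggle room inside the system tablespace
echo "Wiggle room (bytes): $((IBDATA1 - DATANDX))"
```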
CAVEAT : When innodb_file_per_table is enabled, I explain how to get the fragmentation of an individual table: Innodb table with many deletes and inserts - is there any disk space wasted?
Q 4: Is fragmentation the only reason to run “optimize table”?
Yes. It is far more beneficial for MyISAM tables and for InnoDB tables with innodb_file_per_table enabled. Do this with innodb_file_per_table off and you will just make the system tablespace grow faster. See my post How can Innodb ibdata1 file grows by 5X even with innodb_file_per_table set?
Q 5: If I do need to run “optimize table” on an InnoDB table should I run ALTER TABLE mydb.mytable ENGINE=InnoDB; and not ANALYZE TABLE
Running ALTER TABLE mydb.mytable ENGINE=InnoDB; would indeed shrink the table when innodb_file_per_table is enabled. Again, it is not worth it when innodb_file_per_table is disabled.
Q 6: Can you selectively tell which innodb tables needs optimizing if the server does not use the "file per table" option?
No, you cannot. Why? The INFORMATION_SCHEMA becomes largely useless because all the tables live inside one file. I wrote a script to find the update_time (the last time an InnoDB table was written to) of all InnoDB tables (Is there a way to find the least recently used tables in a schema?), but that script only works with innodb_file_per_table. This shows that you cannot ascertain that fragmentation with ease. You could resort to more aggressive techniques, like dumping the tablespace map and locating segments with unused space (see this blog post: http://www.markleith.co.uk/2009/01/19/innodb-table-and-tablespace-monitors/), but that is way too much firepower to deal with. You could just run OPTIMIZE TABLE to eliminate segment fragmentation, but this brings us back full circle to getting everything out of ibdata1.
SUGGESTION
If you want to remove all data and index pages from ibdata1 and shrink ibdata1 permanently, please read my Oct 29, 2010 StackOverflow post: Howto: Clean a mysql InnoDB storage engine?
As you can see, this subject is not new to me:
- Jan 13, 2013: How to check growth of database in mysql?
- Sep 23, 2012: MySQL / InnoDB: Does daily optimization improve performance?
- Apr 15, 2012: What happens when InnoDB hits its tablespace autoextend max?
- Apr 11, 2012: How do you remove fragmentation from InnoDB tables?
- Mar 25, 2012: Why does InnoDB store all databases in one file?
- Feb 03, 2012: Scheduled optimization of tables in MySQL InnoDB
EPILOGUE
Running OPTIMIZE TABLE is not the biggest reason that ibdata1 grows quickly. Please see this post from mysqlperformanceblog and learn about the other contributing factors.
Please remember that most who run OPTIMIZE TABLE do so sequentially. You could probably script many of them to run in parallel. Of course, you would need to convert to innodb_file_per_table as I mentioned before.
Best Answer
OPTIMIZE TABLE performs the following steps internally on a table (mydb.mytable)
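For an InnoDB table, those internal steps amount to a full table rebuild followed by an index-statistics refresh, roughly equivalent to this (a sketch; mydb.mytable and mydb.mytabletmp are placeholder names):

```sql
-- Build an empty copy with the same definition
CREATE TABLE mydb.mytabletmp LIKE mydb.mytable;
-- Copy every row into the fresh, contiguous copy
INSERT INTO mydb.mytabletmp SELECT * FROM mydb.mytable;
-- Swap the copy in place of the original
DROP TABLE mydb.mytable;
ALTER TABLE mydb.mytabletmp RENAME mydb.mytable;
-- Refresh the index statistics
ANALYZE TABLE mydb.mytable;
```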
Since there is DDL involved, there is no way around queries taking a big performance hit during the operation. Additionally, not performing any optimization would be just as bad.
What you need is to have MySQL Master/Master (aka Circular) Replication set up
You could then try this:
For Servers M1 and M2 and DB VIP pointing to M1
On M2, run the following
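A sketch of the session (assuming a single table mydb.mytable as a placeholder; in practice you would run this for every InnoDB table you want rebuilt):

```sql
-- Keep this session's DDL out of the binary log so it does not replicate to M1
SET sql_log_bin = 0;
OPTIMIZE TABLE mydb.mytable;
SET sql_log_bin = 1;
```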
The SET sql_log_bin = 0 would prevent the DDL commands from replicating over to the other Master. Once those steps are complete, promote the Slave to Master and demote the Master to Slave (this can be done by just moving your DB VIP from M1 to M2). You could perform this maintenance every day, and production will not feel any effects with the exception of the Master promotion and Slave demotion.
You could create the script and run it on M2 like this:
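A sketch of such a script (it assumes a ~/.my.cnf with login credentials on M2; the schema exclusion list is illustrative):

```shell
#!/bin/sh
# Generate one OPTIMIZE TABLE statement per InnoDB user table,
# prefixed with SET sql_log_bin = 0 so none of the DDL replicates to M1.
SQL="SET sql_log_bin = 0;"
SQL="${SQL} $(mysql -ANe "SELECT CONCAT('OPTIMIZE TABLE ',table_schema,'.',table_name,';')
  FROM information_schema.tables
  WHERE engine='InnoDB'
  AND table_schema NOT IN ('mysql','information_schema','performance_schema')")"
# Run the whole batch in a single session
echo "${SQL}" | mysql
```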
From here, just wait for Seconds_Behind_Master to be 0 on M2, then move the DBVIP from M1 to M2. Now if you know the specific names of the tables you want optimized, you could adjust the query to fetch just those tables.
Give it a Try !!!
CAVEAT
Here is a fair warning: if you have innodb_file_per_table disabled, every time you run OPTIMIZE TABLE or ALTER TABLE ... ENGINE=InnoDB;, the ibdata1 file just grows. You would need to clean up the InnoDB infrastructure to prevent ibdata1 from growing out of control.