Mysql – Why does MySQL (InnoDB) table get faster after OPTIMIZE TABLE, but then don’t work

indexinnodbMySQLperformance

I have a Django web application that stores data in a MySQL InnoDB database. There is a particular page that is accessed a lot on the django admin, and the query is taking a long time (~20 seconds). Since it's the Django internals, the query cannot be changed.

There are 3 tables, A, B, and C. The query looks like:

SELECT * 
FROM A 
   INNER JOIN B ON (A.b_id = B.foo) 
   INNER JOIN C ON (B.foo = C.id) 
ORDER BY A.id DESC 
LIMIT 100

A simple join-3-tables together.

The id fields are primary keys and have indexes. A.b_id, B.foo both have their own indexes.

However the query plan looks wrong and says it's not using any keys on B (but it is using the keys for the other joins). From reading lots of MySQL performance stuff it should in theory be using indexes, since it's various const joins that can 'fall through'. It says it has to scan all ~1,200 rows of B.

The weird thing is that I OPTIMIZEed each one on my local machine and re-ran the queries (with SQL_NO_CACHE) and it was much faster, 0.02sec vs. the original 20sec. EXPLAIN on the same query gave a different, and much more sensible result, showing that it can use an index on each one, and that it doesn't have to scan the whole lot. A co-worker ran OPTIMIZE for each one on a testing machine with approximately the same data (which was recently recreated from a loaded dump file) and it also showed a speed increase, and a sensible-explain.

So we ran that on the live system… and it didn't change anything (the speed nor explain). I recreated my MySQL database (DROPed the database and reloaded from a dump), and now the OPTIMIZE doesn't change anything (i.e. ~20sec run time, bad query plan).

Why does this happen? How can I get MySQL to use the correct indexes and get back my 0.02s query time? This blog post ( http://www.xaprb.com/blog/2010/02/07/how-often-should-you-use-optimize-table/ ) implies that OPTIMIZE only really optimizes the primary key (not that b_id, foo are not primary indexes). How can I "rebuild secondary index"? I tried doing ALTER TABLE A ENGINE=InnoDB (and the same for B & C) and there was no change.

Actually the more I look at this, the more it seems to be a MySQL query plan fail. It's doing the wrong query plan and not using indexes that it could use. After doing an OPTIMIZE TABLE etc., it is sometimes able to use the correct indexes. When there's a bad query plan like this, it can be sorta randomly selected, that's why (I think) different machines have different results.

Best Answer

OPTIMIZE rebuilds the table. This (for InnoDB) squeezes out some of the fragmentation and wasted space. This is unlikely to make a noticeable difference in any query.

Also, OPTIMIZE does an ANALYZE. This has a chance of changing the statistics, thereby leading to a different (better or worse) EXPLAIN plan.

Since ANALYZE is much faster (on InnoDB) than OPTIMIZE, just do the ANALYZE.

Various non-ANALYZE actions cause an ANALYZE to be done.

ANALYZE randomly probes the BTrees, gathering stats. Sometimes the resulting stats are poor. There is effectively no way to prevent this from happening. Several partial hacks have been created over the years; 5.6.7 gets close to eliminating this problem with ANALYZE. Here's one of them: http://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html#sysvar_innodb_stats_persistent_sample_pages