MySQL: logging queries which would execute without using indexes

index-tuningMySQL

I am trying to use log_queries_not_using_indexes = 1 to find queries which are not executing optimally on a MySQL server. However, I find the resulting log file of rather limited value. Apparently, queries are logged whenever the optimizer really decided not to use an index as a criterion in a WHERE clause. And not if they truly have no indexes matching the filtered columns.

So given a table with the following structure

CREATE TABLE `test` (
    `id_test`   int(11) NOT NULL AUTO_INCREMENT,
    `some_text`  varchar(255) DEFAULT NULL,
    `some_more_text` text,
 PRIMARY KEY (`id_test`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1 

a query SELECT id_test from test where id_test != 69 would be logged to the slow log because of not using indexes (the optimizer has decided that a table scan is more efficient as not much could be won by using an index) but SELECT id_test from test where id_test = 69 would not.

I would expect the behavior of the latter query in the first case as well since the index is present. As is, it makes troubleshooting missing indexes rather tiresome. Ideas on how to approach this greatly appreciated.

Best Answer

I'm not really sure why you expected log_queries_not_using_indexes=1 to not log queries that didn't use an index. The idea of the setting is to see which queries are getting run on your server that could be ideal candidates for optimization. For your example:

SELECT id_test from test where id_test != 69

Unless you change the requirements of the query, there's not much you can do. Of course it's faster to do a full scan and return all rows that don't match that id_test value.

As for ideas on how to approach analysis:

  • You can set min_examined_row_limit to a higher value. If, for example, you want to exclude queries that examine less than 5,000 rows, set this variable to 5000. This is useful to reduce the 'noise' in the slow query log.

  • You can use a tool called pt-query-digest with the --review option to review your slow queries. The tool allows you to store your review history so you don't repeat work on the same query during analysis:

    It won’t show you queries you’ve already reviewed. A query is considered to be already reviewed if you’ve set a value for the reviewed_by column. (If you want to see queries you’ve already reviewed, use the --report-all option.)

    It also has the added benefit of storing a history of the queries that get logged in your slow query. Who wouldn't love historical evidence of improving your queries?