Mysql – Monstrous query hanging on “copying to tmp table” in MySQL

innodbMySQL

I have a huge query being run by one of my developers which seems to hang no matter which performance tunables I adjust. Though I'm somewhat of novice at writing SQL queries, at first glance the query seems like it could use a lot of work. I've already set tmpdir to a RAM disk, but this doesn't seem to help. It would also appear that even though the mysql process state is in "copying to tmp table", nothing is actually being written there. This is on an Intel Xeon L5520 (8-core machine w/HT) with 72GB RAM.

Here is the query:

http://pastebin.com/6mGUtncc

Here is an explain statement run on that query:

id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
1       PRIMARY <derived2>      ALL     NULL    NULL    NULL    NULL    397520  Using temporary; Using filesort
2       DERIVED <derived3>      ALL     NULL    NULL    NULL    NULL    397520  Using temporary; Using filesort
2       DERIVED o       eq_ref  PRIMARY,ndx_local_order_date,ndx_merchant_id    PRIMARY 4       order_stats.master_ticket_id    1       Using where
2       DERIVED m1      eq_ref  PRIMARY,fk_merchant3    PRIMARY 4       ticketing_brevard_prod.o.merchant_id    1       Using where
2       DERIVED oi      ref     order_items1    order_items1    4       order_stats.master_ticket_id    4
2       DERIVED th      ref     PRIMARY,ticket_header2  ticket_header2  4       order_stats.master_ticket_id    2       Using where
3       DERIVED m       index_merge     PRIMARY,fk_merchant3    PRIMARY,fk_merchant3    4,5     NULL    2       Using union(PRIMARY,fk_merchant3); Using where; Using index; Using temporary; Using filesort
3       DERIVED o2      ref     PRIMARY,ndx_order_date,ndx_merchant_id  ndx_merchant_id 4       ticketing_brevard_prod.m.merchant_id    40603   Using where
3       DERIVED th2     ref     ticket_header2,ticket_header3,fk_ticket_header7 ticket_header2  4       ticketing_brevard_prod.o2.master_ticket_id      2
3       DERIVED customer_type   eq_ref  PRIMARY PRIMARY 4       ticketing_brevard_prod.th2.customer_type        1
3       DERIVED package_header  eq_ref  PRIMARY PRIMARY 4       ticketing_brevard_prod.th2.package_id   1
4       DEPENDENT SUBQUERY      mpc     eq_ref  PRIMARY,merchant_package_characs2,merchant_package_characs3     PRIMARY 60      ticketing_brevard_prod.th2.merchant_id,ticketing_brevard_prod.th2.package_id    1       Using where

I think it's clear to see from the EXPLAIN output that copying 397520 rows of data to tmp is the likely culprit, however I'm not sure what to suggest to the dev who wrote query on how to improve it. It'd be great if someone could point me in the right direction. Thanks in advance!

SHOW VARIABLES LIKE '%buffer%';

Variable_name   Value
bulk_insert_buffer_size 8388608
innodb_buffer_pool_shm_checksum ON
innodb_buffer_pool_shm_key      0
innodb_buffer_pool_size 60129542144
innodb_change_buffering inserts
innodb_log_buffer_size  4194304
join_buffer_size        16777216
key_buffer_size 67108864
myisam_sort_buffer_size 134217728
net_buffer_length       16384
preload_buffer_size     32768
read_buffer_size        16777216
read_rnd_buffer_size    33554432
sort_buffer_size        1073741824
sql_buffer_result       OFF

Best Answer

Wow, that is a doozie of a query. At this point it has little to do with your configuration variables and you should focus on how to optimize the query.

It is tough to give you a magic bullet, though, without knowing what question you are trying to determine from this monstrous query. At first glance, it appears it is trying to do too much.

I would start by removing these two problems from your EXPLAIN plan:

1       PRIMARY <derived2>      ALL     NULL    NULL    NULL    NULL    397520  Using temporary; Using filesort
2       DERIVED <derived3>      ALL     NULL    NULL    NULL    NULL    397520  Using temporary; Using filesort

The first I think is the query for 'order_stats' derived table, the second I am assuming is the 'main' derived table, though it might be flipped. Regardless, I would start at the 'order_stats' (leave a comment with the pastebin of the explain and running time of non-explain, please).

Depending on what this query is trying to accomplish, I would recommend storing the results in a temporary table to be used in the join. One question I would like to know from your developer is, what is the english of the package_tickets and order_tickets computed columns. And if it could be accomplished another way if it turns out to be the bottleneck.