MySQL performance tuning + queries stuck on “Copying to tmp table”

MySQLperformance

The latter part of the question's title (queries stuck on "Copying to tmp table") has been addressed many times, and I have spent a fair amount of time researching this. I would appreciate it if you guys could help me come to a conclusion – especially with my particular server set up in mind.

Quick overview of server:
– Dedicated server with 2 cores and 64 GB RAM
– Only runs MySQL

The setup is in no way tweaked, so current config is to some extent way off. Hopefully your scolding could provide knowledge.

The web application running on the server is a Magento site with 25.000+ products. The query giving the most headache is the one generating sitemaps.

Currently, the following query has been stuck for a little over an hour on "Copying to tmp table":

Note: I do not really need input on how to increase performance by
optimizing this query, I would rather see how much I can shave query
time off the queries that already exist.

SELECT DISTINCT `e`.`entity_id`, `ur`.`request_path` AS `url`, `stk`.`is_in_stock` FROM `catalog_product_entity` AS `e`
INNER JOIN `catalog_product_website` AS `w` ON e.entity_id=w.product_id
LEFT JOIN `core_url_rewrite` AS `ur` ON e.entity_id=ur.product_id AND ur.category_id IS NULL AND ur.store_id='1' AND ur.is_system=1
INNER JOIN `catalog_category_product_index` AS `cat_index` ON e.entity_id=cat_index.product_id AND cat_index.store_id='1' AND cat_index.category_id in ('2', '3', '68', '86', '145', '163', '182', '196', '198', '214', '249', '252', '285', '286', '288', '289', '290', '292', '549') AND cat_index.position!=0
INNER JOIN `cataloginventory_stock_item` AS `stk` ON e.entity_id=stk.product_id AND stk.is_in_stock=1
INNER JOIN `catalog_product_entity_int` AS `t1_visibility` ON e.entity_id=t1_visibility.entity_id AND t1_visibility.store_id=0
LEFT JOIN `catalog_product_entity_int` AS `t2_visibility` ON t1_visibility.entity_id = t2_visibility.entity_id AND t1_visibility.attribute_id = t2_visibility.attribute_id AND t2_visibility.store_id='1'
INNER JOIN `catalog_product_entity_int` AS `t1_status` ON e.entity_id=t1_status.entity_id AND t1_status.store_id=0
LEFT JOIN `catalog_product_entity_int` AS `t2_status` ON t1_status.entity_id = t2_status.entity_id AND t1_status.attribute_id = t2_status.attribute_id AND t2_status.store_id='1' WHERE (w.website_id='1') AND (t1_visibility.attribute_id='102') AND ((IF(t2_visibility.value_id > 0, t2_visibility.value, t1_visibility.value)) IN(3, 2, 4)) AND (t1_status.attribute_id='96') AND ((IF(t2_status.value_id > 0, t2_status.value, t1_status.value)) IN(1))  

Relevant config:

Server buffers:

max_connections = 1500;  
key_buffer_size = 22G;  
innodb_buffer_pool_size = 16G;  
innodb_additional_mem_pool_size = 2G;  
innodb_log_buffer_size = 400M;  
query_cache_size = 64M;  

Per thread buffers:

read_buffer_size = 2M;  
read_rnd_buffer_size = 16M;  
sort_buffer_size = 128M;  
thread_stack = 192K;  
join_buffer_size = 8M;  

Question: Does any of these variables seem way off to any of you?

The above memory limits will allow me to use 130% more RAM than is physically installed on my system. Obviously, something must change.

According to the MySQLTuner Perl script, I am currently in danger of using:
Total buffers: 40.7G global + 28.2M per thread (1500 max threads)
Maximum possible memory usage: 82.0G (130% of installed RAM)

Questions: What would give the most performance increase of the above variables, or: will it be more useful to increase the total server buffer limit or the per thread buffer limits?

The two variables that seem to affect "Copying to tmp table" the most are:
– tmp_table_size
– max_heap_table_size

Mine are both set to 0.25G

Question: Any recommendations specifically for those to variables?

There are two suggested fixes that seem to get mentioned more that others:
– Kill process. Do flush table.
– Kill process. Do repair / optimize tables.

Question: Which of the two above suggested solution do you think is the most viable?

Best Answer

This query definitely needs to be optimized since the "DISTINCT" part is forcing creation of temporary tables.

If you really don't want to touch the query (which is a mistake) - then your other option would be to create a RAM disk (tmpfs) for your tmp files. The query will still be creating temp tables, however it will avoid the disk I/O part.

Hope this helps.