Complicated joins can be a precursor to temp table usage. By default, tmp_table_size is set to the same value as max_heap_table_size (16M).
For any given query, once a tmp table reaches 16M in RAM, mysqld performs the following migration:
- the query suspends operation
- mysqld will migrate the entire 16M of tmp table data onto disk as a MyISAM table
- the query continues until completion
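As a quick sanity check (a minimal sketch; the exact defaults vary by MySQL version), you can see the current thresholds and how often tmp tables have already spilled to disk:
-- current in-memory tmp table limits (the smaller of the two wins)
SHOW VARIABLES LIKE 'tmp_table_size';
SHOW VARIABLES LIKE 'max_heap_table_size';
-- how many tmp tables were created, and how many had to migrate to disk
SHOW GLOBAL STATUS LIKE 'Created_tmp_tables';
SHOW GLOBAL STATUS LIKE 'Created_tmp_disk_tables';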
You must also keep in mind where the disk-based tmp tables will land. By default, the folder where disk-based tmp tables land is defined by tmpdir.
- For Windows, this would be C:\TEMP or C:\TMP (environment variable TEMP or TMP)
- For Linux, this would be /tmp
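You can confirm where your particular server will write them (the value shown depends on how mysqld was started):
SHOW VARIABLES LIKE 'tmpdir';
-- or equivalently
SELECT @@global.tmpdir;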
If mysqld is competing for tmpdir usage with the OS, you should feel it rather quickly, because the OS will start swapping to find the elbow room it needs in the system temp folder while the query in question might be trying to force a tmp table into the same OS tmp space.
You may want to look into one or more of the following
- mounting /tmp to a RAM disk
- mounting /tmp to another disk away from the root partition
- setting tmp_table_size and max_heap_table_size to 2K (that's not a typo, I SAID 2K) to force tmp tables to go to tmpdir quickly, thus shortening the tmp table memory-to-disk migration time
- I wrote about this in StackOverflow
UPDATE 2012-03-13 15:11 EDT
Setting tmp_table_size and max_heap_table_size to 512M is dangerously high. In the event of bad queries (especially a Cross Join, aka Cartesian Join), mysqld will spend time migrating a 512M MEMORY table to a disk-based MyISAM table before the query continues processing. You may need to lower that number significantly to force tmp tables onto disk faster. This is imperative since tmp_table_size and max_heap_table_size are set per DB connection.
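As an illustration only (2K is the aggressive setting suggested above, not a universal recommendation, and MySQL may silently raise values that fall below its documented minimums), lowering the thresholds looks like this:
-- affects new connections only; existing sessions keep their old values
SET GLOBAL tmp_table_size = 2048;
SET GLOBAL max_heap_table_size = 2048;
-- for the current session only, e.g. to test a single reporting query
SET SESSION tmp_table_size = 2048;
SET SESSION max_heap_table_size = 2048;
-- to survive a restart, put the same settings under [mysqld] in my.cnf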
Not a DBA or MySQL expert here, but let's try :).
So let's take your second query - a bit smaller than the 1st one - and simplify the table names.
We have something like: (LO = logs, CL = cloud_logs, CAV = client_application_versions, CA = client_applications)
SELECT LO.* FROM LO
INNER JOIN CL ON CL.id = LO.cloud_log_id
INNER JOIN CAV ON CAV.id = CL.client_application_version_id
INNER JOIN CA ON CA.id = CAV.client_application_id
WHERE (LO.deleted_at IS NULL)
AND (CA.account_id = '3')
AND (CA.id = '5')
ORDER BY timestamp DESC LIMIT 100 OFFSET 0
And so you say it takes about 100 seconds, correct?
When you say:
I have indexes on all applicable fields already.
Yet I believe that's where the flaw is. You don't have that many joins, and whether you have 7 billion rows or just 700, this should perform well if the indexing is thought out correctly. I think it's probably the ORDER BY / LIMIT that is messing with your performance because of poor indexing.
1/ Have you tried:
SELECT LO.* FROM LO WHERE (LO.deleted_at IS NULL)
or
SELECT * FROM CA WHERE (CA.account_id = '3') AND (CA.id = '5')
See how these queries perform time-wise, and whether everything is OK with these 2 tables.
2/ Have you indexed timestamp as well? Indexing the column you are doing the ORDER BY on is crucial too.
In fact, you should even think about your data and how many values you're going to have for each of the columns you're querying on. This is very well explained right here: http://www.mysqlperformanceblog.com/2006/09/01/order-by-limit-performance-optimization/ and will certainly help you.
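For example (hypothetical index name, and assuming timestamp is a column of logs as in the simplified query above):
ALTER TABLE logs ADD INDEX idx_logs_timestamp (`timestamp`);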
3/ From what I've read on MySQL a few minutes ago, you could also try mysqlcheck to see if everything is OK with your tables, if you think your indexing is fine: http://dev.mysql.com/doc/refman/5.0/en/mysqlcheck.html . I know that in older versions of Oracle we had to compute statistics after creating indexes; maybe there is something similar here?
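If you prefer to stay inside the mysql client, mysqlcheck is essentially a wrapper around statements like these (table names follow the simplified aliases above):
CHECK TABLE logs, cloud_logs, client_application_versions, client_applications;
-- refresh index statistics, roughly the MySQL counterpart of computing stats in Oracle
ANALYZE TABLE logs, cloud_logs, client_application_versions, client_applications;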
Hope this helps.
[EDIT: 12/01/13, after comments]
OK, glad to see you already divided the time by 4, but indeed 25s is way too long.
1/ Have you tried to play with the indexes by creating one that would make sense, as explained by Peter here (http://www.mysqlperformanceblog.com/2006/09/01/order-by-limit-performance-optimization/)? For example, an index on (CA.account_id, CA.id, timestamp), etc.?
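One caveat: a MySQL index cannot span several tables, so the idea above translates into per-table composite indexes, something along these lines (hypothetical names, assuming account_id and id live on client_applications and timestamp on logs):
ALTER TABLE client_applications ADD INDEX idx_ca_account_app (account_id, id);
-- on the ORDER BY side, cover the filter and the sort column together
ALTER TABLE logs ADD INDEX idx_logs_deleted_ts (deleted_at, `timestamp`);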
2/ How long does it take when you get rid of the ORDER BY / LIMIT, like below?
SELECT LO.* FROM LO
INNER JOIN CL ON CL.id = LO.cloud_log_id
INNER JOIN CAV ON CAV.id = CL.client_application_version_id
INNER JOIN CA ON CA.id = CAV.client_application_id
WHERE (LO.deleted_at IS NULL)
AND (CA.account_id = '3')
AND (CA.id = '5')
This will check whether it is the ORDER BY / LIMIT that is messing up your performance.
3/ If point 2 is confirmed, you could try something like:
SELECT LO.* FROM LO
INNER JOIN CL ON CL.id = LO.cloud_log_id
INNER JOIN CAV ON CAV.id = CL.client_application_version_id
INNER JOIN CA ON CA.id = CAV.client_application_id
INNER JOIN
(
SELECT LO.id FROM LO
INNER JOIN CL ON CL.id = LO.cloud_log_id
INNER JOIN CAV ON CAV.id = CL.client_application_version_id
INNER JOIN CA ON CA.id = CAV.client_application_id
WHERE (LO.deleted_at IS NULL)
AND (CA.account_id = '3')
AND (CA.id = '5')
ORDER BY timestamp DESC LIMIT 0,100
) AS PERF ON PERF.id = LO.id
Replace LO.id with the column that makes sense for logs (I suppose you have some sort of logs id). This is based on: http://explainextended.com/2009/10/23/mysql-order-by-limit-performance-late-row-lookups/
Note you can change the LIMIT 0,100 and use the OFFSET keyword form instead in case you need it (e.g. if PostgreSQL compatibility is required).
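For instance, these two forms are interchangeable in MySQL:
ORDER BY timestamp DESC LIMIT 0,100
-- is the same as
ORDER BY timestamp DESC LIMIT 100 OFFSET 0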
Best Answer
It looks like your reporting query locks the tables needed by general queries for a while.
There is no general recipe for such cases and you have to refactor your reporting to avoid locking. Sometimes you can achieve this by splitting one big query into a series of smaller ones. Sometimes the good approach is to copy some tables for exclusive use by the report generator. Sometimes it is something else. It all depends on your DB schema.
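For instance, a minimal sketch of the copy-a-table approach (table and column names here are placeholders, not your actual schema):
-- build a snapshot for the report generator, then point the reporting queries at it
DROP TABLE IF EXISTS logs_report_snapshot;
CREATE TABLE logs_report_snapshot AS
SELECT * FROM logs WHERE deleted_at IS NULL;
-- the report now reads logs_report_snapshot and no longer holds locks on logs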