Complicated joins can be a precursor to temp table usage. By default, tmp_table_size is set to the same value as max_heap_table_size (16M).
For any given query, once a tmp table reaches 16M in RAM, mysqld performs the following migration:
- the query suspends operation
- mysqld will migrate the entire 16M of tmp table data onto disk as a MyISAM table
- the query continues until completion
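As a quick sanity check (a minimal sketch; the exact defaults vary by MySQL version), you can see the current thresholds and how often tmp tables have already spilled to disk:
-- current in-memory tmp table limits (the smaller of the two wins)
SHOW VARIABLES LIKE 'tmp_table_size';
SHOW VARIABLES LIKE 'max_heap_table_size';
-- how many tmp tables were created, and how many had to migrate to disk
SHOW GLOBAL STATUS LIKE 'Created_tmp_tables';
SHOW GLOBAL STATUS LIKE 'Created_tmp_disk_tables';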
You must also keep in mind where the disk-based tmp tables will land. By default, the folder where disk-based tmp tables land is defined by tmpdir.
- For Windows, this would be C:\TEMP or C:\TMP (environment variable TEMP or TMP)
- For Linux, this would be /tmp
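You can confirm where your particular server will write them (the value shown depends on how mysqld was started):
SHOW VARIABLES LIKE 'tmpdir';
-- or equivalently
SELECT @@global.tmpdir;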
If mysqld is competing for tmpdir usage with the OS, you should feel it rather quickly, because the OS will start swapping to find the elbow room it needs in the system temp folder while the query in question might be trying to force a tmp table into the same OS tmp space.
You may want to look into one or more of the following
- mounting /tmp to a RAM disk
- mounting /tmp to another disk away from the root partition
- setting tmp_table_size and max_heap_table_size to 2K (that's not a typo, I SAID 2K) to force tmp tables to go to tmpdir quickly, thus shortening the tmp table memory-to-disk migration time
- I wrote about this in StackOverflow
UPDATE 2012-03-13 15:11 EDT
Setting tmp_table_size and max_heap_table_size to 512M is dangerously high. In the event of bad queries (especially a Cross Join, aka Cartesian Join), mysqld will spend time migrating a 512M MEMORY table to a disk-based MyISAM table before the query continues processing. You may need to lower that number significantly to force tmp tables onto disk faster. This is imperative since tmp_table_size and max_heap_table_size are set per DB connection.
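As an illustration only (2K is the aggressive setting suggested above, not a universal recommendation, and MySQL may silently raise values that fall below its documented minimums), lowering the thresholds looks like this:
-- affects new connections only; existing sessions keep their old values
SET GLOBAL tmp_table_size = 2048;
SET GLOBAL max_heap_table_size = 2048;
-- for the current session only, e.g. to test a single reporting query
SET SESSION tmp_table_size = 2048;
SET SESSION max_heap_table_size = 2048;
-- to survive a restart, put the same settings under [mysqld] in my.cnf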
Not a DBA or MySQL expert here, but let's try :).
So let's take your second query - a bit smaller than the 1st one - and simplify the table names.
We have something like: (LO = logs, CL = cloud_logs, CAV = client_application_versions, CA = client_applications)
SELECT LO.* FROM LO
INNER JOIN CL ON CL.id = LO.cloud_log_id
INNER JOIN CAV ON CAV.id = CL.client_application_version_id
INNER JOIN CA ON CA.id = CAV.client_application_id
WHERE (LO.deleted_at IS NULL)
AND (CA.account_id = '3')
AND (CA.id = '5')
ORDER BY timestamp DESC LIMIT 100 OFFSET 0
And so you say it takes about 100 seconds, correct?
When you say:
I have indexes on all applicable fields already.
Yet I believe that's where the flaw is. You don't have that many joins, and whether you have 7 billion rows or just 700, this should perform well if the indexing is thought out correctly. I think it's probably the ORDER BY / LIMIT that is messing with your performance because of poor indexing.
1/ Have you tried:
SELECT LO.* FROM LO WHERE (LO.deleted_at IS NULL)
or
SELECT * FROM CA WHERE (CA.account_id = '3') AND (CA.id = '5')
See how these queries perform time-wise, and whether everything is OK with these 2 tables.
2/ Have you indexed timestamp as well? Indexing the column you are doing the ORDER BY on is crucial too.
In fact, you should even think about your data and how many values you're going to have for each of the columns you're querying on. This is very well explained right here: http://www.mysqlperformanceblog.com/2006/09/01/order-by-limit-performance-optimization/ and will certainly help you.
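For example (hypothetical index name, and assuming timestamp is a column of logs as in the simplified query above):
ALTER TABLE logs ADD INDEX idx_logs_timestamp (`timestamp`);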
3/ From what I've read on MySQL a few minutes ago, you could also try mysqlcheck to see if everything is OK with your tables, if you think your indexing is fine: http://dev.mysql.com/doc/refman/5.0/en/mysqlcheck.html . I know that in older versions of Oracle we had to compute statistics after creating indexes; maybe there is something similar here?
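If you prefer to stay inside the mysql client, mysqlcheck is essentially a wrapper around statements like these (table names follow the simplified aliases above):
CHECK TABLE logs, cloud_logs, client_application_versions, client_applications;
-- refresh index statistics, roughly the MySQL counterpart of computing stats in Oracle
ANALYZE TABLE logs, cloud_logs, client_application_versions, client_applications;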
Hope this helps.
[EDIT: 12/01/13, after comments]
OK, glad to see you already divided the time by 4, but indeed 25s is way too long.
1/ Have you tried to play with the indexes by creating one that would make sense, as explained by Peter here (http://www.mysqlperformanceblog.com/2006/09/01/order-by-limit-performance-optimization/)? For example, an index on (CA.account_id, CA.id, timestamp), etc.?
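One caveat: a MySQL index cannot span several tables, so the idea above translates into per-table composite indexes, something along these lines (hypothetical names, assuming account_id and id live on client_applications and timestamp on logs):
ALTER TABLE client_applications ADD INDEX idx_ca_account_app (account_id, id);
-- on the ORDER BY side, cover the filter and the sort column together
ALTER TABLE logs ADD INDEX idx_logs_deleted_ts (deleted_at, `timestamp`);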
2/ How long does it take when you get rid of the ORDER BY / LIMIT, like below?
SELECT LO.* FROM LO
INNER JOIN CL ON CL.id = LO.cloud_log_id
INNER JOIN CAV ON CAV.id = CL.client_application_version_id
INNER JOIN CA ON CA.id = CAV.client_application_id
WHERE (LO.deleted_at IS NULL)
AND (CA.account_id = '3')
AND (CA.id = '5')
This will check whether it is the ORDER BY / LIMIT that is messing up your performance.
3/ If point 2 is confirmed, you could try something like:
SELECT LO.* FROM LO
INNER JOIN CL ON CL.id = LO.cloud_log_id
INNER JOIN CAV ON CAV.id = CL.client_application_version_id
INNER JOIN CA ON CA.id = CAV.client_application_id
INNER JOIN
(
SELECT LO.id FROM LO
INNER JOIN CL ON CL.id = LO.cloud_log_id
INNER JOIN CAV ON CAV.id = CL.client_application_version_id
INNER JOIN CA ON CA.id = CAV.client_application_id
WHERE (LO.deleted_at IS NULL)
AND (CA.account_id = '3')
AND (CA.id = '5')
ORDER BY timestamp DESC LIMIT 0,100
) AS PERF ON PERF.id = LO.id
Replace LO.id with the column that makes sense for logs (I suppose you have some sort of logs id). This is based on: http://explainextended.com/2009/10/23/mysql-order-by-limit-performance-late-row-lookups/
Note you can change the LIMIT 0,100 and use the OFFSET keyword form instead in case you need it (e.g. if PostgreSQL compatibility is required).
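For instance, these two forms are interchangeable in MySQL:
ORDER BY timestamp DESC LIMIT 0,100
-- is the same as
ORDER BY timestamp DESC LIMIT 100 OFFSET 0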
Best Answer
It looks like your reporting query locks the tables needed by general queries for a while.
There is no general recipe for such cases and you have to refactor your reporting to avoid locking. Sometimes you can achieve this by splitting one big query into a series of smaller ones. Sometimes the good approach is to copy some tables for exclusive use by the report generator. Sometimes it is something else. It all depends on your DB schema.
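For instance, a minimal sketch of the copy-a-table approach (table and column names here are placeholders, not your actual schema):
-- build a snapshot for the report generator, then point the reporting queries at it
DROP TABLE IF EXISTS logs_report_snapshot;
CREATE TABLE logs_report_snapshot AS
SELECT * FROM logs WHERE deleted_at IS NULL;
-- the report now reads logs_report_snapshot and no longer holds locks on logs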