I would like to get rid of "Using temporary; Using filesort"
One of the problems I see is that you're using different GROUP BY
and ORDER BY
clauses. From the manual on how MySQL uses temporary tables:
If there is an ORDER BY clause and a different GROUP BY clause, or if the ORDER BY or GROUP BY contains columns from tables other than the first table in the join queue, a temporary table is created.
As soon as you create a temporary table, it will need to be sorted according to your ORDER BY
clause, indicated by 'using filesort'.
This execution plan at leasts uses the indexes to appropriately limit the number of rows found.
I would also look through the docs on ORDER BY optimization.
Because the WHERE
condition of the query involves only equality checks:
WHERE "songs"."processed" = 't'
AND "songs"."working" = 't'
and then you have:
SELECT DISTINCT ON (songs.rank, songs.shared_id) ...
which is similar to GROUP BY songs.rank, songs.shared_id
I would first try adding a compound index on (first the columns in WHERE
, then the columns in DISTINCT ON
):
(processed, working, rank, shared_id)
The ordering: ORDER BY rank DESC
may be better optimized if you have the index as:
(processed, working, rank DESC, shared_id)
Not really sure if this would contribute to efficiency but you can test.
Addition by @Erwin
As per request in comment
In principal (default) b-tree indexes can be scanned forward and backward at the same speed. But sorting can make a difference in multi-column indexes where you combine the sort order of multiple columns. The query starts with:
SELECT DISTINCT ON (songs.rank, songs.shared_id)
In combination with ORDER BY rank DESC
this dictates that the result be ordered by rank DESC, shared_id
effectively. After the (simplified) WHERE clause WHERE processed AND working
has been applied and before LIMIT
can be applied.
I have my doubts if the DISTINCT
clause is actually useful. But while it is there, the optimal index for the query should be (just as @ypercube suspected):
CREATE INDEX songs_special_idx
ON songs (processed, working, rank DESC, shared_id);
Looks like one of the rare cases where explicit ordering of index columns would benefit the query. There is an excellent explanation in the chapter Indexes and ORDER BY of the manual.
If the WHERE condition is stable (always WHERE processed AND working
), a partial multi-column index would be smaller and faster, yet:
CREATE INDEX songs_special_idx
ON songs (rank DESC, shared_id)
WHERE processed AND working;
Best Answer
Step 1. Get rid of the
JOIN
.The main purpose of t2 (the way the query is written) is to see if there are amy
message_entries
for the givenmessage
. Do you really need that test? If not, get rid of theJOIN
Step 2. Get rid of the
DISTINCT
.The secondary purpose of the
JOIN
(should you decide to keep it) is to create multiple rows, thereby necessitating theDISTINCT
. But, since it is only testing for the existence of at least one 'entry', toss theJOIN
and add this to theWHERE
:Step 3: Better INDEX (Ok, you already have a good index)
This may allow for getting through the
WHERE
and on to theORDER BY
. Note: TheORDER BY
caused the temp and filesort.Note: A temp and/or filesort is not evil in itself; it is just a sign of either a complex query (where it is necessary) or a missing index or an inefficiently formulated query.
Huh? Since you don't have a
LIMIT
, why do you haveSQL_CALC_FOUND_ROWS
? Count the number of rows in the resultset to get the "found rows".