region_id
int(4) NOT NULL,
Learn about TINYINT UNSIGNED (1 byte) and SMALLINT UNSIGNED (2 bytes). INT is always 4 bytes. Smaller -> more cacheable -> faster.
WHERE traffic.region_Id IN (0,1,2,3,4,5,6) AND traffic.category_id =533 AND
traffic.time_index BETWEEN 2012050100 AND 2012080100 AND
traffic.partner_id IN (-999,-1) AND traffic.entity_id > 0 AND
traffic.country_code >=0 AND traffic.metric IN (1,2,3,4,5)
For an index to be useful, it should start with "=" fields and "IN" fields. After that, only one "range" field will be used. So:
INDEX(category_id, region_Id, Partner_id, xx) -- the order of the first 3 is not critical; start xx with whatever is usually most selective (other than time_index - see below).
Generally, when PARTITIONing, it is best to put the partition key (time_index) last in any indexes it is in. The range key is used for "partition pruning" before the INDEX is used. So, even if is too late in the index, that's ok.
Use EXPLAIN PARTITIONS SELECT ... to see that it is doing the "pruning" correctly.
With the disclaimer that I'm not specifically a MySQL expert:
My guess is that either it's failing to whittle down the huge number of rows that result from such as huge join, or it can't index the title well because the LIKE starts with a wildcard.
Testing the second is easy, see if it runs better without a leading wildcard. If this is the case we can look for solutions.
For the first (and potentially the second as well), I would suggest breaking it up by doing a smaller join into a temporary table, then joining that with the other tables.
For example, if we can assume that there are foreign rows in all tables for a given track, we can SELECT, ORDER and LIMIT based on track_title alone into a temporary table. Then join in all the other tables.
Best Answer
Since only the estimated plan is present, these are going to be some guesses.
This part
Creates a filter far in the execution plan:
on this part:
You could try adding a union to split them up, but due to the amount of joins I would try using a temp table to split some logic in two pieces.
I don't know your datatypes so I am going to use a
SELECT ... INTO
. Consider changing this to aINSERT ... SELECT
.Temp table insert:
You could then change the query to this
Apart from this your functions are also going to hurt, depending on how much data is returned.
You could add this index:
The rest of the indexing is hard to say with an estimated plan, there are index posbilities such as:
But I think that getting the query to apply predicates earlier and/or splitting up the work will have bigger results at the moment.
EDIT
You could try changing the scalar functions to APPLY's. Since you have many apply's already you could add the first function as two apply's on the temp table insert:
EDIT 2
Most of the execution time is on the second part, what happens when you write the second part of the query like this?