Mysql – How to improve the performance of this MySQL query

indexMySQLoptimization

This query takes a long time to run, is there be a way I could rewrite it to make it faster?

SELECT DISTINCT r.racedate, re.trainerid
FROM raceentry re
INNER JOIN race r ON r.raceid = re.raceid
WHERE re.trainerid IS NOT NULL
AND re.Trainer30in90Starts IS NULL
AND re.trainerid IN (
    SELECT DISTINCT trainerid
    FROM pp_entries
)
AND re.raceid IN (
    SELECT re2.raceid
    FROM raceentry re2
    INNER JOIN race r2 ON r2.raceid = re2.raceid
    WHERE re2.horsename = re.horsename
    ORDER BY r2.racedate DESC
)
ORDER BY r.racedate DESC, re.trainerid LIMIT 500

This is what it displays when I run the EXPLAIN on it:

1   SIMPLE  pp_entries  NULL    range   IX_pp_Entries_1,trainerid               trainerid   5   NULL    4835    46.66   Using where; Using index; Using temporary; Using f...
1   SIMPLE  re          NULL    ref     PRIMARY,IX_RaceEntry,ix_RaceEntry_Horse,RaceEntry2...   RaceEntry_trainerID     5   thefigures_dbo.pp_entries.trainerid     10  10.00   Using where
1   SIMPLE  r           NULL    eq_ref  PRIMARY,IX_Race     PRIMARY     17  thefigures_dbo.re.RaceID    1   100.00  NULL
1   SIMPLE  r2          NULL    eq_ref  PRIMARY,IX_Race     PRIMARY     17  thefigures_dbo.re.RaceID    1   100.00  Distinct
1   SIMPLE  re2         NULL    eq_ref  PRIMARY,IX_RaceEntry,ix_RaceEntry_Horse,RaceEntry2...   PRIMARY     49  thefigures_dbo.re.RaceID,thefigures_dbo.re.HorseNa...   1   100.00  Using index; Distinct

Thanks guys, used a combo of the suggestions and ended up with this query

SELECT DISTINCT r.racedate, re.trainerid
FROM raceentry re
INNER JOIN race r ON r.raceid = re.raceid
WHERE re.trainerid IS NOT NULL
AND re.Trainer30in90Starts IS NULL
AND re.trainerid IN (
    SELECT trainerid
    FROM pp_entries
)
LIMIT 300

The biggest improvement seemed to be removing the 2nd IN() clause suggested by Oreo

Best Answer

Try using EXISTS instead of IN, it tends to take good shortcuts when filtering your main records.

Also, you had a redundant ORDER BY within your IN filter, I hope the MySQL optimiser got rid of that for you, but you might as well remove that from the code since is has no effect on your results.

Actually, looking closer at your last IN filter, it seems pretty redundant, as it's joining receentry to itself, using horsename and raceid?

Try the below:

SELECT DISTINCT r.racedate, re.trainerid
FROM raceentry re
JOIN race r ON r.raceid = re.raceid
WHERE re.trainerid IS NOT NULL
    AND re.Trainer30in90Starts IS NULL
    AND EXISTS (
        SELECT 1
        FROM pp_entries ppe
        WHERE re.trainerid = ppe.trainerid
        )
ORDER BY r.racedate DESC, re.trainerid
LIMIT 500