Mysql – optimize the query thesql

indexjoin;MySQLoptimizationquery-performance

I have the following query:

SELECT DISTINCT d.movieName, d.castName, d.movieImdbId, f.year, f.posterLink
FROM director_movie as d LEFT JOIN film_info as f ON d.movieImdbId = f.ImdbId
WHERE d.castName LIKE '%castname%'

for some names it works well, but for some other it takes a long time to retrieve results..

film_info is a view containing 100,000 rows.
The structure is: movieName, ImdbId, year, rate, posterLink(link to image of movie)

director_movie: is a table containing 2,259,630 rows.
The structure is: Id(unique), castName, castImdbId, movieName, movieImdbId

The goal of this query join is to retrieve list of movies by each director.
I defined castName and movieImdbId as indexes (in order to speed up joins and read from table), but actually I am a bit confused about how to chose an appropriate index. (If I have to use it in my case..). I read one of the factor to see if index is useful is cardinality which I am also confused with this term.. because it change when I add another index to my table..
I tried different combinations of columns for indexes but none of them speed up my query results..

Could someone kindly explain me what is the best way to improve the performance of my query (in terms of time)?

Thanks in advance

Best Answer

This is your original query

SELECT DISTINCT d.movieName, d.castName, d.movieImdbId, f.year, f.posterLink
FROM director_movie as d LEFT JOIN film_info as f ON d.movieImdbId = f.ImdbId
WHERE d.castName LIKE '%castname%'

There are three things you can do

SUGGESTION #1

You should reorganize the query so that the castName is searched first

SELECT DISTINCT d.movieName, d.castName, d.movieImdbId, f.year, f.posterLink
FROM (SELECT movieName, castName, movieImdbId
FROM director_movie WHERE castName LIKE '%castname%') d
LEFT JOIN film_info as f ON d.movieImdbId = f.ImdbId;

SUGGESTION #2

Run these please

SELECT COUNT(1) WithDistinctCount FROM
(SELECT DISTINCT d.movieName, d.castName, d.movieImdbId, f.year, f.posterLink
FROM (SELECT movieName, castName, movieImdbId
FROM director_movie WHERE castName LIKE '%castname%') d
LEFT JOIN film_info as f ON d.movieImdbId = f.ImdbId) A;
SELECT COUNT(1) WithoutDistinctCount FROM
(SELECT d.movieName, d.castName, d.movieImdbId, f.year, f.posterLink
FROM (SELECT movieName, castName, movieImdbId
FROM director_movie WHERE castName LIKE '%castname%') d
LEFT JOIN film_info as f ON d.movieImdbId = f.ImdbId) A;

If WithDistinctCount and WithoutDistinctCount are the same, then remove DISTINCT. It may actually return the results faster.

SUGGESTION #3 (Optional)

If you implement my FULLTEXT suggestion (Joining two Mysql VIEWs takes a very long time), you could redo subquery d

SELECT DISTINCT d.movieName, d.castName, d.movieImdbId, f.year, f.posterLink
FROM (SELECT movieName, castName, movieImdbId
FROM director_movie WHERE MATCH(castName) AGAINST ('+castname' IN BOOLEAN MODE)) d
LEFT JOIN film_info as f ON d.movieImdbId = f.ImdbId;

This requires you create a FULLTEXT index on director_movie

ALTER TABLE director_movie ADD FULLTEXT castname_ft_ndx (castname);

GIVE IT A TRY !!!