I don't see a lot of opportunity for improvement.
The index you added was probably a big help, because it's being used for the range matching on the WHERE clause (type => range, key => tran_date), and it's being used as a covering index (extra => using index), avoiding the need to seek into the table to fetch the row data.
But since you're using functions to construct the financial_year value for the group by, both the "using filesort" and "using temporary" can't be avoided. But, those aren't the real problem. The real problem is that you're evaluating MONTH(tran_date) 346,485 times and YEAR(tran_date) at least that many times... ~700,000 function calls in one second doesn't seem too bad.
Plan B: I am definitely not a fan of storing redundant data, and I'm dead-set against making the application responsible for maintaining it... but one option I might be tempted to try would be to create a dashboard_stats_by_financial_year table, and use after-insert/update/delete triggers on the transactions1 table to manage keeping those stats current.
That option has a cost, of course -- adding to the amount of time it takes to update/insert/delete a transaction... but, waiting > 1200 milliseconds for stats for your dashboard is a cost, too. So it may come down to whether you want to pay for it now or pay for it later.
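To make Plan B concrete, here is a minimal sketch of a trigger-maintained summary table. Only transactions1, tran_date and the table name dashboard_stats_by_financial_year come from the discussion above. The amount column, the trigger names, the stats columns, and the financial year starting on April 1 are all assumptions for illustration, so adjust them to your actual schema and fiscal calendar. The sketch also assumes tran_date and amount are NOT NULL.

```sql
-- Hypothetical summary table: one row per financial year.
CREATE TABLE dashboard_stats_by_financial_year (
  financial_year SMALLINT      NOT NULL PRIMARY KEY,
  tran_count     INT           NOT NULL DEFAULT 0,
  total_amount   DECIMAL(18,2) NOT NULL DEFAULT 0
);

-- One-time backfill from existing rows (pays the function-call cost once).
-- Assumption: the financial year starts in April, so Jan-Mar belong to the previous year.
INSERT INTO dashboard_stats_by_financial_year (financial_year, tran_count, total_amount)
SELECT YEAR(tran_date) - (MONTH(tran_date) < 4) AS fy, COUNT(*), SUM(amount)
FROM transactions1
GROUP BY fy;

DELIMITER //   -- mysql client command, so the trigger bodies can contain semicolons

CREATE TRIGGER transactions1_ai AFTER INSERT ON transactions1
FOR EACH ROW
BEGIN
  -- Add the new row's contribution, creating the year's row if needed.
  INSERT INTO dashboard_stats_by_financial_year (financial_year, tran_count, total_amount)
  VALUES (YEAR(NEW.tran_date) - (MONTH(NEW.tran_date) < 4), 1, NEW.amount)
  ON DUPLICATE KEY UPDATE
    tran_count   = tran_count + 1,
    total_amount = total_amount + NEW.amount;
END//

CREATE TRIGGER transactions1_ad AFTER DELETE ON transactions1
FOR EACH ROW
BEGIN
  -- Back out the deleted row's contribution.
  UPDATE dashboard_stats_by_financial_year
     SET tran_count   = tran_count - 1,
         total_amount = total_amount - OLD.amount
   WHERE financial_year = YEAR(OLD.tran_date) - (MONTH(OLD.tran_date) < 4);
END//

CREATE TRIGGER transactions1_au AFTER UPDATE ON transactions1
FOR EACH ROW
BEGIN
  -- Back out the old values first (the row may have moved to another year)...
  UPDATE dashboard_stats_by_financial_year
     SET tran_count   = tran_count - 1,
         total_amount = total_amount - OLD.amount
   WHERE financial_year = YEAR(OLD.tran_date) - (MONTH(OLD.tran_date) < 4);
  -- ...then add the new values.
  INSERT INTO dashboard_stats_by_financial_year (financial_year, tran_count, total_amount)
  VALUES (YEAR(NEW.tran_date) - (MONTH(NEW.tran_date) < 4), 1, NEW.amount)
  ON DUPLICATE KEY UPDATE
    tran_count   = tran_count + 1,
    total_amount = total_amount + NEW.amount;
END//

DELIMITER ;
```

With something like this in place, the dashboard would read a handful of pre-aggregated rows instead of scanning 346,485 index entries, at the price of a little extra work on every write to transactions1.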
FULLTEXT indexes have to be handled with great care. Why? While FULLTEXT index searches do work, the MySQL query optimizer tends to fall back to full table scans if you do not express the query properly.
Let's take your query and look for 'tom':
SELECT DISTINCT c.movieName, c.castName, c.movieImdbId, f.year, f.posterLink
FROM cast_movie as c JOIN film_info as f ON c.ImdbId = f.ImdbId
WHERE c.castName LIKE '%tom%';
This joins everything. Then it scans the resulting temp table and returns every row containing tom.
Let's implement the FULLTEXT method
SELECT DISTINCT c.movieName, c.castName, c.movieImdbId, f.year, f.posterLink
FROM cast_movie as c JOIN film_info as f ON c.ImdbId = f.ImdbId
WHERE MATCH(c.castName) AGAINST ('+tom' IN BOOLEAN MODE);
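The MATCH ... AGAINST form only helps if cast_movie actually has a FULLTEXT index on castName. A minimal sketch of creating one, where the index name ft_castName is a hypothetical choice:

```sql
-- Create the FULLTEXT index that MATCH(castName) relies on.
-- ft_castName is an illustrative name; any unused index name works.
ALTER TABLE cast_movie ADD FULLTEXT INDEX ft_castName (castName);

-- Confirm the index exists (look for Index_type = FULLTEXT).
SHOW INDEX FROM cast_movie;
```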
I can almost assure you that a full table scan will occur during the join, because I have experienced it hundreds of times.
I have been writing posts about FULLTEXT and query refactoring ever since I saw a post on webmasterworld.com entitled "MySQL Match Against + a very expensive join. How do I optimize this?"
Given the information I just mentioned, let's take your query and refactor it to handle the FULLTEXT search and still do a good join.
SELECT DISTINCT c.movieName, c.castName, c.movieImdbId, f.year, f.posterLink
FROM cast_movie as c JOIN film_info as f ON c.ImdbId = f.ImdbId
WHERE MATCH(c.castName) AGAINST ('+tom' IN BOOLEAN MODE);
First, apply the MATCH ... AGAINST operator to cast_movie and return ImdbId and the other fields:
SELECT ImdbId,movieName,castName,movieImdbId FROM cast_movie
WHERE MATCH(castName) AGAINST ('+tom' IN BOOLEAN MODE)
Make that a subquery to provide keys to join and retrieve data from film_info
SELECT DISTINCT c.movieName, c.castName, c.movieImdbId, f.year, f.posterLink FROM
(
SELECT ImdbId,movieName,castName,movieImdbId FROM cast_movie
WHERE MATCH(castName) AGAINST ('+tom' IN BOOLEAN MODE)
) as c JOIN film_info as f ON c.ImdbId = f.ImdbId;
This should produce a better result.
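Rather than taking that on faith, you can compare the execution plans of the two forms with EXPLAIN. This is only a sketch of how one might check. The actual plans depend on your MySQL version, table sizes, and indexes.

```sql
-- Plan for the direct join with MATCH in the WHERE clause.
EXPLAIN
SELECT DISTINCT c.movieName, c.castName, c.movieImdbId, f.year, f.posterLink
FROM cast_movie AS c JOIN film_info AS f ON c.ImdbId = f.ImdbId
WHERE MATCH(c.castName) AGAINST ('+tom' IN BOOLEAN MODE);

-- Plan for the refactored form: FULLTEXT search first, join second.
EXPLAIN
SELECT DISTINCT c.movieName, c.castName, c.movieImdbId, f.year, f.posterLink FROM
(
SELECT ImdbId, movieName, castName, movieImdbId FROM cast_movie
WHERE MATCH(castName) AGAINST ('+tom' IN BOOLEAN MODE)
) AS c JOIN film_info AS f ON c.ImdbId = f.ImdbId;
```

In the output, type = fulltext on cast_movie means the FULLTEXT index is being used, while type = ALL means a full table scan.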
If your tables use the InnoDB storage engine, remember to tune the InnoDB fulltext options. Any fulltext option that does not start with innodb_ is meant for MyISAM (see my post "MySQL FullText search on string shorter than 3 chars returns no rows").
Give it a try!
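As a quick way to see which settings apply to you, the server variables can be inspected directly. The sketch below only reads values and changes nothing:

```sql
-- InnoDB fulltext settings (apply to FULLTEXT indexes on InnoDB tables).
SHOW VARIABLES LIKE 'innodb_ft%';

-- MyISAM fulltext settings (no innodb_ prefix).
SHOW VARIABLES LIKE 'ft_%';
```

The minimum indexed word length is controlled by innodb_ft_min_token_size for InnoDB and ft_min_word_len for MyISAM. Both are startup options, so changing either means editing the server configuration, restarting MySQL, and then rebuilding the FULLTEXT index before the new value takes effect.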
Best Answer
I'll approach it with a new DB design that suits the purpose.
So the Movie entity has an n:m relation with the Person entity: a person is cast in many movies, and a movie has many persons in its cast. The Movie entity has a year column and the Person entity has a gender column.
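A minimal sketch of what such a design could look like in MySQL. All table and column names here are hypothetical and are not taken from the original fiddle. Only the n:m relation, the year column, and the gender column come from the description above.

```sql
-- Hypothetical Movie entity.
CREATE TABLE movie (
  movie_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  title    VARCHAR(255) NOT NULL,
  `year`   SMALLINT     NOT NULL
) ENGINE=InnoDB;

-- Hypothetical Person entity.
CREATE TABLE person (
  person_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name      VARCHAR(255) NOT NULL,
  gender    CHAR(1)      NOT NULL
) ENGINE=InnoDB;

-- Junction table resolving the n:m relation: one row per (movie, cast member).
CREATE TABLE movie_cast (
  movie_id  INT UNSIGNED NOT NULL,
  person_id INT UNSIGNED NOT NULL,
  PRIMARY KEY (movie_id, person_id),
  KEY idx_person (person_id),
  FOREIGN KEY (movie_id)  REFERENCES movie (movie_id),
  FOREIGN KEY (person_id) REFERENCES person (person_id)
) ENGINE=InnoDB;
```

The composite primary key keeps a person from being listed twice in the same movie, and the secondary index on person_id supports lookups in the other direction (all movies for a given person).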
So create this database as done at SQLFiddle here. Create two tables by queries as follows.
Then query the two temp tables as so:
You can run the SQL Fiddle snippet and see the result.