I have a table with the following schema:
CREATE TABLE `wordtrend` (
`oid` bigint(20) NOT NULL AUTO_INCREMENT,
`monitorId` bigint(20) DEFAULT NULL,
`nGram` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`nGramWord` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`negatives` bigint(20) DEFAULT NULL,
`neutrals` bigint(20) DEFAULT NULL,
`positives` bigint(20) DEFAULT NULL,
`total` bigint(20) DEFAULT NULL,
`trendCut` datetime DEFAULT NULL,
PRIMARY KEY (`oid`)
) ENGINE=MyISAM AUTO_INCREMENT=358539 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
Is it possible to create an index in order to run the following query efficiently?
SELECT nGram, nGramWord, SUM(total) AS sTotal, SUM(positives), SUM(negatives), SUM(neutrals) FROM WordTrend WHERE monitorId = 21751021 AND trendCut >= '2011-01-01 00:00:00' GROUP BY nGram, nGramWord ORDER BY sTotal DESC
We already tried the following:
KEY `RollupIndex` (`monitorId`,`trendCut`)
KEY `RollupIndex2` (`monitorId`,`nGram`,`trendCut`)
but we are getting "Using where; Using temporary; Using filesort" on the extra column.
Best Answer
I noticed something interesting about your query: you are getting these totals for this year. The trendCut date is not a static value. It is a very bad candidate for an index column since the column value changes and it is neither first nor last in the column order of the index.
The index you need is this
monitorID,trendCut,nGram,nGramWord,total
However, this would still be deficient.
Suggestion: If you do queries that require trendCut by year, you should have a trendCutYear column. Try the following:
Here are those steps:
Now, you have a static trendCutYear. You can now query using a static value in the middle of the index. The query can now be rewritten using trendCutYear instead of trendCut:
This should perform a whole lot better for you.
Give it a Try !!!