I am running on MariaDB 10.0.15 on a 4GB RAM cloud server with Local SSD disk for database.
My model:
users <n---n> right_holders <n---n> tracks ---n> raw_detections
I have this table raw_detections
with more than 100 million records. I want to sum all raw_detections
for a specific user in a specific period. This is the query that I am having problem:
SELECT
DATE(raw_detection.created_at),
SUM(1)
FROM `raw_detection`
INNER JOIN `audio_sources` ON `audio_sources`.`id` = `raw_detection`.`audio_source_id`
INNER JOIN `cities` ON `cities`.`id` = `audio_sources`.`city_id`
INNER JOIN `tracks` ON `raw_detection`.`track_id` = `tracks`.`id`
INNER JOIN `track_right_holders` ON `tracks`.`id` = `track_right_holders`.`track_id`
INNER JOIN `right_holders` ON `track_right_holders`.`right_holder_id` = `right_holders`.`id`
INNER JOIN `user_right_holders` ON `right_holders`.`id` = `user_right_holders`.`right_holder_id`
WHERE `raw_detection`.`duplicated` = 0
AND `user_right_holders`.`user_id` = 1
AND (raw_detection.created_at >= '2015-01-18 00:00:00')
AND (raw_detection.created_at <= '2015-02-19 23:59:59')
AND (audio_sources.source_type = 'Radio')
AND (track_right_holders.role = 'Interpreter')
GROUP BY DATE(raw_detection.created_at);
There are some users that have a lot of detections (tens of thousands) per month.
When raw_detection table is MyISAM, this query takes 2-3s to run. When the table is InnoDB, it takes 10-20s to run. My questions are:
1) Why this is happening?
2) How can I improve this performance for both MyISAM and InnoDB
raw_detection
schema:
CREATE TABLE `raw_detection` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`score` int(11) NOT NULL,
`track_id` int(11) NOT NULL,
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`audio_source_id` int(11) NOT NULL,
`duplicated` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `index_raw_detection_on_duplicated_and_created_at` (`duplicated`,`created_at`),
KEY `index_raw_detection_on_created_at` (`created_at`),
KEY `index_raw_detections_audio_source` (`audio_source_id`,`duplicated`,`created_at`),
KEY `index_raw_detection_on_track_id_and_duplicated_and_created_at` (`track_id`,`duplicated`,`created_at`)
) ENGINE=InnoDB AUTO_INCREMENT=126224926 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Best Answer
In your case, your problem is RAM. Why?
In my old post What are the main differences between InnoDB and MyISAM?, I discuss how each Storage Engine does caching.
Here is a picture from Vadim Tkechenko that shows what InnoDB looks like:
InnoDB has more housecleaning to do against the buffer pool as new data/index pages come in and old data/index pages get purged. MyISAM does not have any housecleaning of data to do.
You will need a Cloud Server with more RAM and a larger InnoDB Buffer Pool.
UPDATE 2015-03-20 18:00 EDT
If you can give more memory to the Buffer Pool without getting a bigger Cloud Server, that's fine.
If your Cloud Server is a dedicated DB Server, you should do 75% of RAM. So, that's 3G for innodb_buffer_pool_size. If your Cloud Server is a LAMP stack, perhaps 60% of RAM, which would be 2457M.
If you have lots of data, then I would suggest a Cloud Server with more RAM.
UPDATE 2015-03-20 18:06 EDT
I also noted that you said
If this means the table has lots on INSERTS, you should run this once a week
Then, try the query again. The index statistics might be out of sync with the amount of data currently in the table.