I have the following tables:
CREATE TABLE `events` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`status` varchar(255) COLLATE utf8_unicode_ci DEFAULT 'draft',
`publish_at` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_events_on_status` (`status`),
KEY `index_events_on_status_and_publish_at` (`status`,`publish_at`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=32092 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `event_dates` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`start_date` date NOT NULL,
`end_date` date DEFAULT NULL,
`event_id` int(11) NOT NULL,
`venue_id` int(11) DEFAULT NULL,
`alt_venue_id` int(11) DEFAULT NULL,
`allday` tinyint(1) DEFAULT '0',
PRIMARY KEY (`id`),
KEY `event_dates_event_id_fk` (`event_id`),
KEY `event_dates_venue_id_fk` (`venue_id`),
KEY `event_dates_alt_venue_id_fk` (`alt_venue_id`),
KEY `index_event_dates_on_start_date` (`start_date`),
KEY `index_event_dates_on_end_date` (`end_date`),
KEY `index_event_dates_on_start_date_and_end_date` (`start_date`,`end_date`),
CONSTRAINT `event_dates_alt_venue_id_fk` FOREIGN KEY (`alt_venue_id`) REFERENCES `venues` (`id`) ON DELETE SET NULL,
CONSTRAINT `event_dates_event_id_fk` FOREIGN KEY (`event_id`) REFERENCES `events` (`id`) ON DELETE CASCADE,
CONSTRAINT `event_dates_venue_id_fk` FOREIGN KEY (`venue_id`) REFERENCES `venues` (`id`) ON DELETE SET NULL
) ENGINE=InnoDB AUTO_INCREMENT=152634 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `event_times` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`time_start` time NOT NULL,
`time_end` time DEFAULT NULL,
`date_id` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `event_times_date_id_fk` (`date_id`),
CONSTRAINT `event_times_date_id_fk` FOREIGN KEY (`date_id`) REFERENCES `event_dates` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=66391 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `venues` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
`city_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `venues_city_id_fk` (`city_id`),
KEY `venues_to_cities` (`city_id`),
CONSTRAINT `venues_to_cities` FOREIGN KEY (`city_id`) REFERENCES `cities` (`id`) ON DELETE SET NULL ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=3083 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Events can happen multiple times (even more than once per day) and in different venues (the table event_dates is used to define this concept).
The basic use-case in my application is to get the list of events that are going to happen after a certain date (e.g. show me all events form today onwards).
I use the following query to get the relevant ids of the matching records.
Better said, this query outputs all events occurrences happening from a given date onwards, where an occurrences is just a event_date (if the event takes place only once on the event_date) or a tuple [event_date, event_time] (if the event takes place multiple times on the same event_date).
SELECT
events.id AS event_id,
CONCAT(event_dates.id, '::', IF(event_times.id IS NULL, '', event_times.id)) AS occurrence_id,
venues.id AS venue_id, alt_venues.id AS alt_venue_id,
event_dates.start_date AS event_dates_start_date, event_dates.allday AS event_dates_allday, event_times.time_start AS event_times_time_start
FROM `event_dates`
INNER JOIN `events` ON `events`.`id` = `event_dates`.`event_id`
LEFT JOIN event_times ON event_times.date_id = event_dates.id
INNER JOIN venues ON event_dates.venue_id = venues.id
LEFT JOIN venues AS alt_venues ON event_dates.alt_venue_id = alt_venues.id
WHERE
(events.status = 'published' AND (events.publish_at IS NULL OR events.publish_at <= NOW()))
AND (1 = 1) AND (1 = 1) AND
(
('2015-06-27' <= event_dates.start_date)
OR
(event_dates.end_date is NOT NULL AND '2015-06-27' <= event_dates.end_date)
)
LIMIT 0, 100000
MySQL takes ~40ms to return 8.4K rows.
The output of EXPLAIN EXTENDED is:
+----+-------------+-------------+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+---------+-----------------------------------------+------+----------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------------+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+---------+-----------------------------------------+------+----------+------------------------------------+
| 1 | SIMPLE | events | ref | PRIMARY,index_events_on_status,index_events_on_status_and_publish_at | index_events_on_status | 768 | const | 7467 | 100.00 | Using index condition; Using where |
| 1 | SIMPLE | event_dates | ref | event_dates_event_id_fk,event_dates_venue_id_fk,index_event_dates_on_start_date,index_event_dates_on_end_date,index_event_dates_on_start_date_and_end_date | event_dates_event_id_fk | 4 | agenda_staging.events.id | 1 | 100.00 | Using where |
| 1 | SIMPLE | venues | eq_ref | PRIMARY | PRIMARY | 4 | agenda_staging.event_dates.venue_id | 1 | 100.00 | Using index |
| 1 | SIMPLE | alt_venues | eq_ref | PRIMARY | PRIMARY | 4 | agenda_staging.event_dates.alt_venue_id | 1 | 100.00 | Using index |
| 1 | SIMPLE | event_times | ref | event_times_date_id_fk | event_times_date_id_fk | 4 | agenda_staging.event_dates.id | 1 | 100.00 | NULL |
+----+-------------+-------------+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+---------+-----------------------------------------+------+----------+------------------------------------+
I'm not a DB expert at all, but looking at type and extra columns I understand that MySQL is properly using my indexes.
However, in all the perf-related material I found googling, 8K is a very tiny result-set.
Given that the hardware I'm on is a 4-core SSD-based MacBook Pro, I believe that it should be an order of magnitude faster.
40ms alone would not be a problem.
However in the application I need data from various other tables for every event.
I fetch that data from the following query, which uses the one above as a subquery.
SELECT
dative.occurrence_id, dative.event_id, dative.venue_id, dative.alt_venue_id, images.id AS image_id,
GROUP_CONCAT(DISTINCT event_categories_events.event_category_id ORDER BY event_category_id) AS event_category_ids,
COUNT(DISTINCT events_promoters.promoter_id) AS promoters_count,
GROUP_CONCAT(DISTINCT event_texts.id) AS event_text_ids,
GROUP_CONCAT(DISTINCT communications.id ORDER BY communications.id) AS communication_ids,
GROUP_CONCAT(DISTINCT events_tags.tag_id ORDER BY tag_id) AS tag_ids,
GROUP_CONCAT(DISTINCT collection_eds_events.collection_edition_id ORDER BY collection_edition_id) AS collection_edition_ids
FROM (
SELECT
events.id AS event_id,
CONCAT(event_dates.id, '::', IF(event_times.id IS NULL, '', event_times.id)) AS occurrence_id,
venues.id AS venue_id, alt_venues.id AS alt_venue_id
, event_dates.start_date AS event_dates_start_date, event_dates.allday AS event_dates_allday, event_times.time_start AS event_times_time_start
FROM `event_dates`
INNER JOIN `events` ON `events`.`id` = `event_dates`.`event_id`
LEFT JOIN event_times ON event_times.date_id = event_dates.id
INNER JOIN venues ON event_dates.venue_id = venues.id
LEFT JOIN venues AS alt_venues ON event_dates.alt_venue_id = alt_venues.id
WHERE
(events.status = 'published' AND (events.publish_at IS NULL OR events.publish_at <= NOW()))
AND (1 = 1) AND (1 = 1) AND
(
('2015-06-27' <= event_dates.start_date)
OR
(event_dates.end_date is NOT NULL AND '2015-06-27' <= event_dates.end_date)
)
)
AS dative
INNER JOIN events ON events.id = dative.event_id
INNER JOIN `product_eds_events` ON `product_eds_events`.`event_id` = `events`.`id`
INNER JOIN `product_editions` ON `product_editions`.`id` = `product_eds_events`.`product_edition_id`
INNER JOIN `images` ON `images`.`event_id` = `events`.`id`
INNER JOIN event_texts ON event_texts.event_id = events.id AND event_texts.language = 'it'
LEFT JOIN collection_eds_events ON collection_eds_events.event_id = events.id
LEFT JOIN collection_editions ON collection_editions.id = collection_eds_events.collection_edition_id
LEFT JOIN communications ON communications.event_id = events.id AND communications.status = "published" AND communications.publish_at <= NOW() AND communications.expire_at >= CURDATE()
INNER JOIN event_categories_events ON event_categories_events.event_id = events.id
INNER JOIN events_promoters ON events_promoters.event_id = events.id
LEFT JOIN events_tags ON events_tags.event_id = events.id
WHERE `product_editions`.`product_id` = 1
GROUP BY dative.occurrence_id
ORDER BY event_dates_start_date ASC, event_dates_allday ASC, event_times_time_start ASC
LIMIT 0, 100000
This query takes 340ms to return 8.2K rows.
And this is the EXPLAIN EXTENDED output:
+----+-------------+-------------------------+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+---------+------------------------------------------------------------+------+----------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------------------------+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+---------+------------------------------------------------------------+------+----------+----------------------------------------------+
| 1 | PRIMARY | product_editions | ref | PRIMARY,product_editions_product_id_fk | product_editions_product_id_fk | 5 | const | 4 | 100.00 | Using index; Using temporary; Using filesort |
| 1 | PRIMARY | product_eds_events | ref | index_unique_product_editions_events,product_eds_events_event_id_fk | index_unique_product_editions_events | 5 | agenda_staging.product_editions.id | 1059 | 100.00 | Using where; Using index |
| 1 | PRIMARY | event_texts | ref | index_event_texts_on_event_id_and_language | index_event_texts_on_event_id_and_language | 773 | agenda_staging.product_eds_events.event_id,const | 1 | 100.00 | Using where; Using index |
| 1 | PRIMARY | events | eq_ref | PRIMARY | PRIMARY | 4 | agenda_staging.product_eds_events.event_id | 1 | 100.00 | Using index |
| 1 | PRIMARY | event_categories_events | ref | event_categories_events_event_id_fk | event_categories_events_event_id_fk | 5 | agenda_staging.product_eds_events.event_id | 1 | 100.00 | NULL |
| 1 | PRIMARY | images | ref | images_event_id_fk | images_event_id_fk | 5 | agenda_staging.product_eds_events.event_id | 1 | 100.00 | Using index |
| 1 | PRIMARY | events_promoters | ref | index_events_promoters_on_event_id_and_promoter_id | index_events_promoters_on_event_id_and_promoter_id | 4 | agenda_staging.product_eds_events.event_id | 1 | 100.00 | Using index |
| 1 | PRIMARY | collection_eds_events | ref | collection_eds_events_event_id_fk | collection_eds_events_event_id_fk | 5 | agenda_staging.product_eds_events.event_id | 1 | 100.00 | NULL |
| 1 | PRIMARY | collection_editions | eq_ref | PRIMARY | PRIMARY | 4 | agenda_staging.collection_eds_events.collection_edition_id | 1 | 100.00 | Using index |
| 1 | PRIMARY | communications | ref | communications_event_id_fk | communications_event_id_fk | 5 | agenda_staging.product_eds_events.event_id | 1 | 100.00 | Using where |
| 1 | PRIMARY | events_tags | ref | index_events_tags_on_event_id_and_tag_id | index_events_tags_on_event_id_and_tag_id | 4 | agenda_staging.product_eds_events.event_id | 1 | 100.00 | Using index |
| 1 | PRIMARY | <derived2> | ref | <auto_key0> | <auto_key0> | 4 | agenda_staging.product_eds_events.event_id | 10 | 100.00 | NULL |
| 2 | DERIVED | events | ref | PRIMARY,index_events_on_status,index_events_on_status_and_publish_at | index_events_on_status | 768 | const | 7467 | 100.00 | Using index condition; Using where |
| 2 | DERIVED | event_dates | ref | event_dates_event_id_fk,event_dates_venue_id_fk,index_event_dates_on_start_date,index_event_dates_on_end_date,index_event_dates_on_start_date_and_end_date | event_dates_event_id_fk | 4 | agenda_staging.events.id | 1 | 100.00 | Using where |
| 2 | DERIVED | venues | eq_ref | PRIMARY | PRIMARY | 4 | agenda_staging.event_dates.venue_id | 1 | 100.00 | Using index |
| 2 | DERIVED | alt_venues | eq_ref | PRIMARY | PRIMARY | 4 | agenda_staging.event_dates.alt_venue_id | 1 | 100.00 | Using index |
| 2 | DERIVED | event_times | ref | event_times_date_id_fk | event_times_date_id_fk | 4 | agenda_staging.event_dates.id | 1 | 100.00 | NULL |
+----+-------------+-------------------------+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+---------+------------------------------------------------------------+------+----------+----------------------------------------------+
Again, it seems to me that the indexes are being used.
Am I missing some obvious issue here?
Or are these timings to be expected?
All tables are InnoDB with indexed foreign keys.
I am sorry for the very long question.
I did not include the DDL for the tables in the second query for brevity's sake.
If they can of help, I will edit my question with a like to a pastebin with the complete DDL.
Please I need help to solve this cause I won't get acceptable response times for my webapp otherwise.
Best Answer
LIMIT 10000
?status
be turned into anENUM
(1 byte) instead of a potentially largeVARCHAR(255)
?NULL
mean forend_time
? Forend_date
?published_at IS NOT NULL
is equivalent tostatus = 'published'
?product_eds_events
has a large value forRows
; please tell us what the table is about, and provideSHOW CREATE TABLE
.JOINs
into subqueries; see below.below...
The reason for this is... You have an "explode-implode" caused by
JOIN
plusGROUPing
. First, theJOIN
explodes the number of 'rows' and builds a temporary table for them. Then theGROUPing
(GROUP_CONCAT
, in this case), implodes. The better (I hope) approach would be:That particular subquery needs
INDEX(event_id, event_category_id)
in this order.(Apply most of my suggestions, then start a fresh Question with fresh
CREATE TABLEs
,EXPLAINs
, etc. Augmenting this Question could get messy and hard to follow.)