MySQL Performance issue: ~360ms to fetch 120 rows with 10 columns

indexindex-tuningMySQLmysql-5.6performancequery-performance

I have the following tables:

CREATE TABLE `events` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `status` varchar(255) COLLATE utf8_unicode_ci DEFAULT 'draft',
  `publish_at` datetime DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `index_events_on_status` (`status`),
  KEY `index_events_on_status_and_publish_at` (`status`,`publish_at`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=32092 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

CREATE TABLE `event_dates` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `start_date` date NOT NULL,
  `end_date` date DEFAULT NULL,
  `event_id` int(11) NOT NULL,
  `venue_id` int(11) DEFAULT NULL,
  `alt_venue_id` int(11) DEFAULT NULL,
  `allday` tinyint(1) DEFAULT '0',
  PRIMARY KEY (`id`),
  KEY `event_dates_event_id_fk` (`event_id`),
  KEY `event_dates_venue_id_fk` (`venue_id`),
  KEY `event_dates_alt_venue_id_fk` (`alt_venue_id`),
  KEY `index_event_dates_on_start_date` (`start_date`),
  KEY `index_event_dates_on_end_date` (`end_date`),
  KEY `index_event_dates_on_start_date_and_end_date` (`start_date`,`end_date`),
  CONSTRAINT `event_dates_alt_venue_id_fk` FOREIGN KEY (`alt_venue_id`) REFERENCES `venues` (`id`) ON DELETE SET NULL,
  CONSTRAINT `event_dates_event_id_fk` FOREIGN KEY (`event_id`) REFERENCES `events` (`id`) ON DELETE CASCADE,
  CONSTRAINT `event_dates_venue_id_fk` FOREIGN KEY (`venue_id`) REFERENCES `venues` (`id`) ON DELETE SET NULL
) ENGINE=InnoDB AUTO_INCREMENT=152634 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

CREATE TABLE `event_times` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `time_start` time NOT NULL,
  `time_end` time DEFAULT NULL,
  `date_id` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `event_times_date_id_fk` (`date_id`),
  CONSTRAINT `event_times_date_id_fk` FOREIGN KEY (`date_id`) REFERENCES `event_dates` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=66391 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

CREATE TABLE `venues` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
  `city_id` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `venues_city_id_fk` (`city_id`),
  KEY `venues_to_cities` (`city_id`),
  CONSTRAINT `venues_to_cities` FOREIGN KEY (`city_id`) REFERENCES `cities` (`id`) ON DELETE SET NULL ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=3083 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

Events can happen multiple times (even more than once per day) and in different venues (the table event_dates is used to define this concept).

The basic use-case in my application is to get the list of events that are going to happen after a certain date (e.g. show me all events form today onwards).

I use the following query to get the relevant ids of the matching records.

Better said, this query outputs all events occurrences happening from a given date onwards, where an occurrences is just a event_date (if the event takes place only once on the event_date) or a tuple [event_date, event_time] (if the event takes place multiple times on the same event_date).

SELECT 
events.id AS event_id,
CONCAT(event_dates.id, '::', IF(event_times.id IS NULL, '', event_times.id)) AS occurrence_id,
venues.id AS venue_id, alt_venues.id AS alt_venue_id,
event_dates.start_date AS event_dates_start_date, event_dates.allday AS event_dates_allday, event_times.time_start AS event_times_time_start 
FROM `event_dates` 
INNER JOIN `events` ON `events`.`id` = `event_dates`.`event_id` 
LEFT JOIN event_times ON event_times.date_id = event_dates.id 
INNER JOIN venues ON event_dates.venue_id = venues.id 
LEFT JOIN venues AS alt_venues ON event_dates.alt_venue_id = alt_venues.id 
WHERE 
    (events.status = 'published' AND (events.publish_at IS NULL OR events.publish_at <= NOW()))
    AND (1 = 1) AND (1 = 1) AND 
    (
        ('2015-06-27' <= event_dates.start_date)
         OR
        (event_dates.end_date is NOT NULL AND '2015-06-27' <= event_dates.end_date)
    ) 
LIMIT 0, 100000

MySQL takes ~40ms to return 8.4K rows.

The output of EXPLAIN EXTENDED is:

+----+-------------+-------------+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+---------+-----------------------------------------+------+----------+------------------------------------+
| id | select_type | table       | type   | possible_keys                                                                                                                                              | key                     | key_len | ref                                     | rows | filtered | Extra                              |
+----+-------------+-------------+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+---------+-----------------------------------------+------+----------+------------------------------------+
|  1 | SIMPLE      | events      | ref    | PRIMARY,index_events_on_status,index_events_on_status_and_publish_at                                                                                       | index_events_on_status  | 768     | const                                   | 7467 |   100.00 | Using index condition; Using where |
|  1 | SIMPLE      | event_dates | ref    | event_dates_event_id_fk,event_dates_venue_id_fk,index_event_dates_on_start_date,index_event_dates_on_end_date,index_event_dates_on_start_date_and_end_date | event_dates_event_id_fk | 4       | agenda_staging.events.id                |    1 |   100.00 | Using where                        |
|  1 | SIMPLE      | venues      | eq_ref | PRIMARY                                                                                                                                                    | PRIMARY                 | 4       | agenda_staging.event_dates.venue_id     |    1 |   100.00 | Using index                        |
|  1 | SIMPLE      | alt_venues  | eq_ref | PRIMARY                                                                                                                                                    | PRIMARY                 | 4       | agenda_staging.event_dates.alt_venue_id |    1 |   100.00 | Using index                        |
|  1 | SIMPLE      | event_times | ref    | event_times_date_id_fk                                                                                                                                     | event_times_date_id_fk  | 4       | agenda_staging.event_dates.id           |    1 |   100.00 | NULL                               |
+----+-------------+-------------+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+---------+-----------------------------------------+------+----------+------------------------------------+

I'm not a DB expert at all, but looking at type and extra columns I understand that MySQL is properly using my indexes.

However, in all the perf-related material I found googling, 8K is a very tiny result-set.

Given that the hardware I'm on is a 4-core SSD-based MacBook Pro, I believe that it should be an order of magnitude faster.

40ms alone would not be a problem.

However in the application I need data from various other tables for every event.
I fetch that data from the following query, which uses the one above as a subquery.

SELECT  
dative.occurrence_id, dative.event_id, dative.venue_id, dative.alt_venue_id, images.id AS image_id,
GROUP_CONCAT(DISTINCT event_categories_events.event_category_id ORDER BY event_category_id) AS event_category_ids,
COUNT(DISTINCT events_promoters.promoter_id) AS promoters_count,
GROUP_CONCAT(DISTINCT event_texts.id) AS event_text_ids,
GROUP_CONCAT(DISTINCT communications.id ORDER BY communications.id) AS communication_ids,
GROUP_CONCAT(DISTINCT events_tags.tag_id ORDER BY tag_id) AS tag_ids,
GROUP_CONCAT(DISTINCT collection_eds_events.collection_edition_id ORDER BY collection_edition_id) AS collection_edition_ids
FROM (
    SELECT 
    events.id AS event_id,
    CONCAT(event_dates.id, '::', IF(event_times.id IS NULL, '', event_times.id)) AS occurrence_id,
    venues.id AS venue_id, alt_venues.id AS alt_venue_id
  , event_dates.start_date AS event_dates_start_date, event_dates.allday AS event_dates_allday, event_times.time_start AS event_times_time_start 
    FROM `event_dates` 
    INNER JOIN `events` ON `events`.`id` = `event_dates`.`event_id` 
    LEFT JOIN event_times ON event_times.date_id = event_dates.id 
    INNER JOIN venues ON event_dates.venue_id = venues.id 
    LEFT JOIN venues AS alt_venues ON event_dates.alt_venue_id = alt_venues.id 
    WHERE 
        (events.status = 'published' AND (events.publish_at IS NULL OR events.publish_at <= NOW()))
        AND (1 = 1) AND (1 = 1) AND 
        (
            ('2015-06-27' <= event_dates.start_date)
             OR
            (event_dates.end_date is NOT NULL AND '2015-06-27' <= event_dates.end_date)
        ) 
    )
AS dative 
INNER JOIN events ON events.id = dative.event_id 
INNER JOIN `product_eds_events` ON `product_eds_events`.`event_id` = `events`.`id` 
INNER JOIN `product_editions` ON `product_editions`.`id` = `product_eds_events`.`product_edition_id` 
INNER JOIN `images` ON `images`.`event_id` = `events`.`id` 
INNER JOIN event_texts ON event_texts.event_id = events.id AND event_texts.language = 'it' 
LEFT JOIN collection_eds_events ON collection_eds_events.event_id = events.id 
LEFT JOIN collection_editions ON collection_editions.id = collection_eds_events.collection_edition_id 
LEFT JOIN communications ON communications.event_id = events.id AND communications.status = "published" AND communications.publish_at <= NOW() AND communications.expire_at >= CURDATE()
INNER JOIN event_categories_events ON event_categories_events.event_id = events.id 
INNER JOIN events_promoters ON events_promoters.event_id = events.id 
LEFT JOIN events_tags ON events_tags.event_id = events.id 
WHERE `product_editions`.`product_id` = 1 
GROUP BY dative.occurrence_id
ORDER BY event_dates_start_date ASC, event_dates_allday ASC, event_times_time_start ASC
LIMIT 0, 100000

This query takes 340ms to return 8.2K rows.

And this is the EXPLAIN EXTENDED output:

+----+-------------+-------------------------+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+---------+------------------------------------------------------------+------+----------+----------------------------------------------+
| id | select_type | table                   | type   | possible_keys                                                                                                                                              | key                                                | key_len | ref                                                        | rows | filtered | Extra                                        |
+----+-------------+-------------------------+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+---------+------------------------------------------------------------+------+----------+----------------------------------------------+
|  1 | PRIMARY     | product_editions        | ref    | PRIMARY,product_editions_product_id_fk                                                                                                                     | product_editions_product_id_fk                     | 5       | const                                                      |    4 |   100.00 | Using index; Using temporary; Using filesort |
|  1 | PRIMARY     | product_eds_events      | ref    | index_unique_product_editions_events,product_eds_events_event_id_fk                                                                                        | index_unique_product_editions_events               | 5       | agenda_staging.product_editions.id                         | 1059 |   100.00 | Using where; Using index                     |
|  1 | PRIMARY     | event_texts             | ref    | index_event_texts_on_event_id_and_language                                                                                                                 | index_event_texts_on_event_id_and_language         | 773     | agenda_staging.product_eds_events.event_id,const           |    1 |   100.00 | Using where; Using index                     |
|  1 | PRIMARY     | events                  | eq_ref | PRIMARY                                                                                                                                                    | PRIMARY                                            | 4       | agenda_staging.product_eds_events.event_id                 |    1 |   100.00 | Using index                                  |
|  1 | PRIMARY     | event_categories_events | ref    | event_categories_events_event_id_fk                                                                                                                        | event_categories_events_event_id_fk                | 5       | agenda_staging.product_eds_events.event_id                 |    1 |   100.00 | NULL                                         |
|  1 | PRIMARY     | images                  | ref    | images_event_id_fk                                                                                                                                         | images_event_id_fk                                 | 5       | agenda_staging.product_eds_events.event_id                 |    1 |   100.00 | Using index                                  |
|  1 | PRIMARY     | events_promoters        | ref    | index_events_promoters_on_event_id_and_promoter_id                                                                                                         | index_events_promoters_on_event_id_and_promoter_id | 4       | agenda_staging.product_eds_events.event_id                 |    1 |   100.00 | Using index                                  |
|  1 | PRIMARY     | collection_eds_events   | ref    | collection_eds_events_event_id_fk                                                                                                                          | collection_eds_events_event_id_fk                  | 5       | agenda_staging.product_eds_events.event_id                 |    1 |   100.00 | NULL                                         |
|  1 | PRIMARY     | collection_editions     | eq_ref | PRIMARY                                                                                                                                                    | PRIMARY                                            | 4       | agenda_staging.collection_eds_events.collection_edition_id |    1 |   100.00 | Using index                                  |
|  1 | PRIMARY     | communications          | ref    | communications_event_id_fk                                                                                                                                 | communications_event_id_fk                         | 5       | agenda_staging.product_eds_events.event_id                 |    1 |   100.00 | Using where                                  |
|  1 | PRIMARY     | events_tags             | ref    | index_events_tags_on_event_id_and_tag_id                                                                                                                   | index_events_tags_on_event_id_and_tag_id           | 4       | agenda_staging.product_eds_events.event_id                 |    1 |   100.00 | Using index                                  |
|  1 | PRIMARY     | <derived2>              | ref    | <auto_key0>                                                                                                                                                | <auto_key0>                                        | 4       | agenda_staging.product_eds_events.event_id                 |   10 |   100.00 | NULL                                         |
|  2 | DERIVED     | events                  | ref    | PRIMARY,index_events_on_status,index_events_on_status_and_publish_at                                                                                       | index_events_on_status                             | 768     | const                                                      | 7467 |   100.00 | Using index condition; Using where           |
|  2 | DERIVED     | event_dates             | ref    | event_dates_event_id_fk,event_dates_venue_id_fk,index_event_dates_on_start_date,index_event_dates_on_end_date,index_event_dates_on_start_date_and_end_date | event_dates_event_id_fk                            | 4       | agenda_staging.events.id                                   |    1 |   100.00 | Using where                                  |
|  2 | DERIVED     | venues                  | eq_ref | PRIMARY                                                                                                                                                    | PRIMARY                                            | 4       | agenda_staging.event_dates.venue_id                        |    1 |   100.00 | Using index                                  |
|  2 | DERIVED     | alt_venues              | eq_ref | PRIMARY                                                                                                                                                    | PRIMARY                                            | 4       | agenda_staging.event_dates.alt_venue_id                    |    1 |   100.00 | Using index                                  |
|  2 | DERIVED     | event_times             | ref    | event_times_date_id_fk                                                                                                                                     | event_times_date_id_fk                             | 4       | agenda_staging.event_dates.id                              |    1 |   100.00 | NULL                                         |
+----+-------------+-------------------------+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+---------+------------------------------------------------------------+------+----------+----------------------------------------------+

Again, it seems to me that the indexes are being used.

Am I missing some obvious issue here?
Or are these timings to be expected?

All tables are InnoDB with indexed foreign keys.

I am sorry for the very long question.

I did not include the DDL for the tables in the second query for brevity's sake.

If they can of help, I will edit my question with a like to a pastebin with the complete DDL.

Please I need help to solve this cause I won't get acceptable response times for my webapp otherwise.

Best Answer

  • Why have LIMIT 10000?
  • Can status be turned into an ENUM (1 byte) instead of a potentially large VARCHAR(255)?
  • It would probably be better to combine date and time. Joining to the 'time' table seems to add to the complexity and hurt performance.
  • It is essentially impossible to deal with (that is, use an index for) a start-end date (or datetime) range. Consider having one row per day, instead of one row for a range of days. Sure, that increases the number of rows in the table, but it will probably greatly decrease the time taken for certain queries.
  • What does NULL mean for end_time? For end_date?
  • Perhaps published_at IS NOT NULL is equivalent to status = 'published'?
  • product_eds_events has a large value for Rows; please tell us what the table is about, and provide SHOW CREATE TABLE.
  • It may be better to change certain JOINs into subqueries; see below.
  • Some of the tables smell like "many-to-many mapping" tables. See this for several optimization tips.

below...

SELECT ...,
    GROUP_CONCAT(DISTINCT event_categories_events.event_category_id
                 ORDER BY event_category_id) AS event_category_ids,
    ...
  INNER JOIN event_categories_events
          ON event_categories_events.event_id = events.id 
  ...

The reason for this is... You have an "explode-implode" caused by JOIN plus GROUPing. First, the JOIN explodes the number of 'rows' and builds a temporary table for them. Then the GROUPing (GROUP_CONCAT, in this case), implodes. The better (I hope) approach would be:

SELECT ...,
    ( SELECT GROUP_CONCAT(
                 DISTINCT event_category_id
                 ORDER BY event_category_id)
          FROM event_categories_events
          WHERE event_id = events.id
    ) AS event_category_ids,
    ...
  -- no JOIN with event_categories_events
  ...

That particular subquery needs INDEX(event_id, event_category_id) in this order.

(Apply most of my suggestions, then start a fresh Question with fresh CREATE TABLEs, EXPLAINs, etc. Augmenting this Question could get messy and hard to follow.)