MySQL Single Left Join Very Slow

MySQLoptimizationperformancequery-performance

SELECT
    contacts.email, emails_stats.id FROM contacts
LEFT JOIN
    emails_stats ON contacts.id = emails_stats.contact_id
WHERE
    contacts.client_code = 121212121212
GROUP BY
    contacts.email;

This query taken 10.2s

Table contacts:

Rows: ~74k; Indexes: primary and client_code

Table emails_stats

Rows: ~20k; Indexes: primary and contact_id

The query explain:

1   SIMPLE  contacts    ref client_code client_code 9   const   2104    Using temporary; Using filesort
1   SIMPLE  emails_stats    index   NULL    contact_id  36  NULL    19394   Using where; Using index; Using join buffer (Block Nested Loop)

SHOW CREATE TABLE contacts:

CREATE TABLE `contacts` (
  `id` char(36) NOT NULL,
  `name` varchar(255) DEFAULT NULL,
  `email` varchar(255) DEFAULT NULL,
  `client_code` bigint(20) DEFAULT NULL,
  `created` datetime DEFAULT NULL,
  `modified` datetime DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

SHOW CREATE TABLE emails_stats:

CREATE TABLE `emails_stats` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `email_event_id` int(11) NOT NULL,
  `contact_id` char(36) NOT NULL,
  `created` datetime NOT NULL,
  `modified` datetime NOT NULL,
  PRIMARY KEY (`id`),
  KEY `email_event_id` (`email_event_id`),
  KEY `contact_id` (`contact_id`),
) ENGINE=InnoDB AUTO_INCREMENT=19194 DEFAULT CHARSET=latin1

Expected Result:

email                       id
aaaaaaaaa99@yahoo.com.br    136
bbbbbbbbb99@yahoo.com.br    365
ccccccccc99@yahoo.com.br    765

Best Answer

Because you are grouping by contracts.email, there needs to be an aggregate expression over emails_stats.id, which I've chosen as GROUP_CONCAT:

SELECT
    contacts.email, GROUP_CONCAT(emails_stats.id) FROM contacts
LEFT JOIN
    emails_stats ON contacts.id = emails_stats.contact_id
WHERE
    contacts.client_code = 121212121212
GROUP BY
    contacts.email;

As this lookup is by client_code, to enable this to perform, creating an index on client_code needs to be at the first part of the index. As contacts.email is used by the GROUP BY and SELECT results, putting email at the end of the index will save a secondary lookup of the email, and being in the index, its already sorted and hence grouped together.

CREATE INDEX client_code_email ON contacts(client_code,email)

This will enable the WHERE clause to be fast. In Innodb which is used, the contacts.id is the PK at the end of the key, this is use the JOIN which is ideal.

The JOIN to email_stats is on contract_id which already has an index. email_stats.id is the PK already at the end of this index which gets used in the result set.