LEFT JOIN is supposed to be faster than a NOT IN query. This is what I've read online. But practically, this seems to be opposite. I have the following query here,
INSERT IGNORE INTO callrequests (contact, call_status, campaign_id, created_date, active_call)
SELECT contact, '1', '7', '2013-10-20 13:40:51', '0' FROM contacts c WHERE phonebook_id = '62' and c.`contact` NOT IN (SELECT contact FROM dnc_contacts d WHERE group_id = '3') ON DUPLICATE KEY UPDATE call_status = if( call_status = '2', VALUES(call_status), call_status);
which takes around 7 seconds to run. But if I use the same query removing NOT IN, inserting a LEFT JOIN,
INSERT IGNORE callrequests (contact, call_status, campaign_id, created_date, active_call) SELECT c.contact, '1', '7', '2013-10-20 13:40:51', '0' FROM contacts c
LEFT JOIN `dnc_contacts` as B on c.`contact` <> B.`contact` WHERE c.phonebook_id = '62' and B.`group_id` = '3' ON DUPLICATE KEY UPDATE call_status = if( call_status = '2', VALUES(call_status), call_status);
show create table callrequests
CREATE TABLE `callrequests` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`contact` varchar(45) NOT NULL,
`call_status` int(11) NOT NULL,
`campaign_id` int(11) NOT NULL,
`created_date` timestamp NULL DEFAULT NULL,
`updated_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`active_call` tinyint(1) DEFAULT NULL,
`uuid` varchar(60) NOT NULL DEFAULT '',
PRIMARY KEY (`id`,`campaign_id`),
UNIQUE KEY `unique_contact` (`campaign_id`,`contact`),
KEY `fk_callrequest_campaign1_idx` (`campaign_id`),
KEY `active_calls` (`active_call`),
KEY `idx_call_status` (`call_status`),
CONSTRAINT `fk_callrequest_campaign1` FOREIGN KEY (`campaign_id`) REFERENCES `campaigns` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=16384 DEFAULT CHARSET=latin1;
show create table contacts
CREATE TABLE `contacts` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`contact` varchar(45) NOT NULL,
`phonebook_id` int(11) NOT NULL,
`created_date` timestamp NULL DEFAULT NULL,
`updated_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT ' ',
`status` tinyint(1) NOT NULL DEFAULT '1',
`admin` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`,`phonebook_id`) USING BTREE,
KEY `phonebook_id_contact_INDEX` (`contact`,`status`),
KEY `contact_INDEX` (`contact`),
KEY `fk_contact_phonebook1_idx` (`phonebook_id`),
CONSTRAINT `fk_contact_phonebook1` FOREIGN KEY (`phonebook_id`) REFERENCES `phonebooks` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=2484174 DEFAULT CHARSET=latin1;
show create table dnc_contacts
CREATE TABLE `dnc_contacts` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`group_id` int(11) NOT NULL,
`contact` varchar(45) NOT NULL,
`status` tinyint(1) NOT NULL DEFAULT '1',
`created_date` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
`admin` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`,`group_id`) USING BTREE,
KEY `group_id_satus_contact_INDEX` (`status`),
KEY `fk_dnc_contact_dnc_group1_idx` (`group_id`),
CONSTRAINT `fk_dnc_contact_dnc_group1` FOREIGN KEY (`group_id`) REFERENCES `dnc_groups` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=32573 DEFAULT CHARSET=latin1;
This takes ages to complete. Now the interesting part is that the first query runs very fast (7 seconds) on Server version: 5.6.13-rel61.0 Percona and if I run the same first query on Server version: 5.5.5-10.0.4-MariaDB takes a very long time. All server configs are same. Server is running with 32 GB of RAM with AMD Opteron 8 cores with RAID 1 SSD. MySQL config files are all same on all instances as
port = 1979
socket = /opt/percona/tmp/mysql.sock
skip-external-locking
default-storage-engine = InnoDB
tmpdir = /tmp/
innodb_data_home_dir = /opt/percona/data
innodb_log_group_home_dir = /opt/percona/data
key-buffer-size = 32M
myisam-recover = FORCE,BACKUP
max-allowed-packet = 16M
max-connect-errors = 1000000
tmp-table-size = 32M
max-heap-table-size = 32M
query-cache-type = 0
query-cache-size = 0
max-connections = 400
thread-cache-size = 50
open-files-limit = 65535
table-definition-cache = 1024
table-open-cache = 1024
innodb-flush-method = O_DIRECT
innodb-log-files-in-group = 2
innodb-log-file-size = 1G
innodb-flush-log-at-trx-commit = 2
innodb-file-per-table = 1
innodb-buffer-pool-size = 15G
innodb_thread_concurrency = 0
interactive_timeout = 60
wait_timeout = 60
connect_timeout = 5
innodb_lock_wait_timeout = 15
Can't seem to figure out what's going on. Tried to tune a few params but no luck. Is there a way we can improve the query or look at the server configs?
Thanks.
Best Answer
dnc_contacts would benefit from INDEX(group_id, contact) and/or INDEX(contact, group_id)