I am performing the following query on a MySQL table with around a million records ( ~350MB ). It takes roughly 1.5 seconds every time. If I remove the OR last_updated > last_geocoded
comparison, the query time improves by an order of magnitude, to less than 100ms. If I remove the last_geocoded = 0
line, the query time stays at roughly 1.5 seconds.
SELECT id, company_id, raw_address
FROM locations
WHERE
raw_address IS NOT NULL
AND (
last_geocoded = 0
OR last_updated > last_geocoded
)
LIMIT 25
If I reduce the query to just WHERE last_updated > last_geocoded
, the query time is less than 100ms.
SELECT id, company_id, raw_address
FROM locations
WHERE last_updated > last_geocoded
LIMIT 25
How can I speed up the original query? Why is that WHERE statement slowing down my query so much?
Below is the table schema showing my keys:
CREATE TABLE `locations` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`company_id` int(11) unsigned DEFAULT NULL,
`title` varchar(255) DEFAULT NULL,
`description` text,
`raw_address` varchar(255) DEFAULT NULL,
`street` varchar(255) DEFAULT NULL,
`city` varchar(255) DEFAULT NULL,
`city_code` varchar(255) DEFAULT NULL,
`county` varchar(255) DEFAULT NULL,
`county_code` varchar(255) DEFAULT NULL,
`state` varchar(255) DEFAULT NULL,
`state_code` varchar(255) DEFAULT NULL,
`country` varchar(255) DEFAULT NULL,
`country_code` varchar(255) DEFAULT NULL,
`constituency` varchar(255) DEFAULT NULL,
`constituency_code` varchar(255) DEFAULT NULL,
`postal_code` int(5) unsigned zerofill DEFAULT NULL,
`lat` float(10,6) DEFAULT NULL,
`lon` float(10,6) DEFAULT NULL,
`headquarters` tinyint(1) unsigned DEFAULT NULL,
`research_and_development` tinyint(1) unsigned DEFAULT NULL,
`manufacturing` tinyint(1) unsigned DEFAULT NULL,
`distribution` tinyint(1) unsigned DEFAULT NULL,
`sales` tinyint(1) unsigned DEFAULT NULL,
`retail` tinyint(1) unsigned DEFAULT NULL,
`source_id` int(11) unsigned DEFAULT NULL,
`last_updated` datetime NOT NULL,
`last_geocoded` datetime NOT NULL,
PRIMARY KEY (`id`),
KEY `company_id` (`company_id`),
KEY `raw_address` (`raw_address`),
KEY `last_updated` (`last_updated`),
KEY `last_geocoded` (`last_geocoded`),
KEY `city` (`city`),
KEY `county` (`county`),
KEY `state` (`state`),
KEY `country` (`country`),
KEY `postal_code` (`postal_code`)
) ENGINE=InnoDB AUTO_INCREMENT=997680 DEFAULT CHARSET=latin1;
Below is the original query run with EXPLAIN
:
"id","select_type","table","type","possible_keys","key","key_len","ref","rows","Extra"
1,"SIMPLE","locations","range","raw_address,last_geocoded","raw_address",258,NULL,509543,"Using where"
Best Answer
Take the 100ms query and
LEFT JOIN
it back to thelocations
table:Give it a Try !!!