Mysql – Slow MySQL query with float using BETWEEN and ORDER BY

MySQL

Trying to figure out whether it's the BETWEEN being used on a float or the ORDER BY and the lack of an index that is making this query slow.

First the tables used in the BETWEEN and ORDER BY:

mysql> explain objects;
+---------------------------+----------------------+------+-----+---------+----------------+
| Field                     | Type                 | Null | Key | Default | Extra          |
+---------------------------+----------------------+------+-----+---------+----------------+
| object_id                 | int(10) unsigned     | NO   | PRI | NULL    | auto_increment |
| type_id                   | smallint(5) unsigned | NO   | MUL | 0       |                |
| detached                  | tinyint(1)           | YES  |     | 0       |                |
| private_attach            | varchar(16)          | YES  |     | No      |                |
| construction              | varchar(4)           | NO   |     | 0       |                |
| last_edit                 | datetime             | YES  | MUL | NULL    |                |
| creation                  | datetime             | YES  | MUL | NULL    |                |
| last_edit_user_id         | int(10) unsigned     | NO   |     | 0       |                |
| created_user_id           | int(10) unsigned     | NO   | MUL | 0       |                |
| object_name               | varchar(255)         | YES  | MUL | NULL    |                |
| query_name                | varchar(255)         | YES  |     | NULL    |                |
| object_body_ft            | mediumtext           | YES  |     | NULL    |                |
| object_body_nft           | text                 | YES  | MUL | NULL    |                |
| media_file                | varchar(255)         | YES  |     | NULL    |                |
| latitude                  | float(8,5)           | YES  | MUL | NULL    |                |
| longitude                 | float(8,5)           | YES  | MUL | NULL    |                |
| country                   | varchar(255)         | YES  | MUL | NULL    |                |
| state_province            | varchar(255)         | YES  | MUL | NULL    |                |
| county                    | varchar(100)         | YES  |     | NULL    |                |
| elevation                 | float                | YES  |     | NULL    |                |
| continent                 | varchar(255)         | YES  |     | NULL    |                |
| tags                      | text                 | YES  | MUL | NULL    |                |
| tag_text                  | text                 | YES  |     | NULL    |                |
| route_type                | varchar(255)         | YES  |     | NULL    |                |
| activities                | varchar(255)         | YES  |     | NULL    |                |
| seasons                   | varchar(255)         | YES  |     | NULL    |                |
| time_required             | varchar(255)         | YES  |     | NULL    |                |
| difficulty                | varchar(255)         | YES  |     | NULL    |                |
| num_pitches               | int(11)              | YES  |     | NULL    |                |
| date_climbed              | date                 | YES  |     | NULL    |                |
| primary_photo             | varchar(60)          | YES  |     | NULL    |                |
| grade                     | varchar(255)         | YES  |     | NULL    |                |
| rock_difficulty           | varchar(255)         | YES  |     | NULL    |                |
| obj_type_name             | varchar(255)         | YES  |     | NULL    |                |
| is_primary_photo_user_set | tinyint(1)           | NO   |     | 0       |                |
| province                  | varchar(255)         | YES  |     | NULL    |                |
| show_toc                  | varchar(4)           | NO   |     | 0       |                |
| gpx_file                  | varchar(25)          | YES  |     | NULL    |                |
+---------------------------+----------------------+------+-----+---------+----------------+

mysql> explain object_scores;
+--------------------+------------------+------+-----+---------+-------+
| Field              | Type             | Null | Key | Default | Extra |
+--------------------+------------------+------+-----+---------+-------+
| object_id          | int(10) unsigned | NO   | PRI | 0       |       |
| hits               | int(10) unsigned | NO   | MUL | 0       |       |
| score              | float            | NO   | MUL | 0       |       |
| weighted_num_votes | float            | NO   |     | 0       |       |
| weighted_vote_sum  | float            | NO   |     | 0       |       |
+--------------------+------------------+------+-----+---------+-------+

And now the query:

SELECT SQL_CALC_FOUND_ROWS 
`objects`.`query_name`, 
`objects`.`primary_photo`, 
`object_scores`.`hits` AS `hits`, 
`object_scores`.`score` AS `score`, 
`types`.`type_name`, 
`users`.`username` AS `contributor`, 
`objects`.`object_id`, 
`users`.`user_id` AS `contributor_id`, 
`objects`.`object_name`, 
`objects`.`media_file`, 
`objects`.`last_edit`, 
`objects`.`creation`, 
`objects`.`created_user_id`, 
`objects`.`construction`, 
`objects`.`latitude`, 
`objects` .`longitude`, 
ABS(3963.1 * ACOS(COS(RADIANS(`objects`.`latitude`)) * COS(RADIANS(`objects`.`longitude`)) * COS(RADIANS(45.89957)) * COS(RADIANS(7.89368)) + COS(RADIANS(`objects`.`latitude`)) * SIN(RADIANS(`objects`.`longitude`)) * COS(RADIANS(45.89957)) * SIN(RADIANS(7.89368)) + SIN(RADIANS(`objects`.`latitude`)) * SIN(RADIANS(45.89957)))) AS `distance` 

FROM `users` 
JOIN `types` 
JOIN `objects` FORCE INDEX (`longitude`) 
JOIN `object_scores` ON `object_scores`.`object_id`=`objects`.`object_id` 

WHERE `users`.`user_id`=`objects`.`created_user_id` 
AND `types`.`type_id`=`objects`.`type_id` 
AND `objects`.`construction`!='Yes' 
AND `objects`.`detached`!=1 
AND `objects`.`latitude` IS NOT NULL 
AND `objects` .`longitude` IS NOT NULL 
AND `objects`.`latitude` BETWEEN 44.453844928718 AND 47.345295071282 
AND `objects`.`longitude` BETWEEN 5.8162446056357 AND 9.9711153943643 
AND ABS(3963.1 * ACOS(COS(RADIANS(`objects`.`latitude`)) * COS(RADIANS(`objects`.`longitude`)) * COS(RADIANS(45.89957)) * COS(RADIANS(7.89368)) + COS(RADIANS(`objects`.`latitude`)) * SIN(RADIANS(`objects`.`longitude`)) * COS(RADIANS(45.89957)) * SIN(RADIANS(7.89368)) + SIN(RADIANS(`objects`.`latitude`)) * SIN(RADIANS(45.89957)))) < 100 
AND `objects`.`type_id`=3 

ORDER BY `object_scores`.`score` ASC LIMIT 20310, 30;

And the EXPLAIN on the query:

+----+-------------+---------------+--------+---------------+-----------+---------+-----------------------------+-------+---------------------------------+
| id | select_type | table         | type   | possible_keys | key       | key_len | ref                         | rows  | Extra                           |
+----+-------------+---------------+--------+---------------+-----------+---------+-----------------------------+-------+---------------------------------+
|  1 | SIMPLE      | types         | const  | PRIMARY       | PRIMARY   | 2       | const                       |     1 | Using temporary; Using filesort |
|  1 | SIMPLE      | objects       | range  | longitude     | longitude | 5       | NULL                        | 59316 | Using where                     |
|  1 | SIMPLE      | users         | eq_ref | PRIMARY       | PRIMARY   | 4       | cms.objects.created_user_id |     1 |                                 |
|  1 | SIMPLE      | object_scores | eq_ref | PRIMARY       | PRIMARY   | 4       | cms.objects.object_id       |     1 |                                 |
+----+-------------+---------------+--------+---------------+-----------+---------+-----------------------------+-------+---------------------------------+

The objects table has approximately 760,000 entries in it, as does the object_scores table. I'm trying to figure out why so many rows are examined (59316) while doing the query. I suspect it has something to do with the indices on the latitude or longitude or indices with relation to the ORDER BY.

I've tried removing the FORCE INDEX (longitude), but that results in a slower query in most cases.

Suggestions anyone?

Best Answer

No index in the objects table is selective enough to restrict the search to a small number of rows, that's why the query is slow. The engine has to either do a full scan of the table or use an index (which is probably inefficient).

If it uses an index, like the one on (longitude), it can find (fast, using the index) the 59316 rows that have longitude BETWEEN 5.8162446056357 AND 9.9711153943643 but then the corresponding 59316 rows have to be read from the table (and they are probably scattered around the table so a very large portion of the table has to be read. Further all these rows have to be checked for all the other conditions you have in the WHERE and then ordered (Using filesort).

Your best bet is to add a spatial index. This is possible only if the table uses the MyISAM engine (otherwise you could split the table into 2, one as it is with all the columns and using InnoDB and one using MyiSAM and only the columns needed for the spatial index).