I have a structure similar to this one:
CREATE TABLE `author` (
`id` int(11) unsigned NOT NULL auto_increment,
`name` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=1;
CREATE TABLE `book` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`author_id` int(11) unsigned NOT NULL,
`org` int(11) unsigned NOT NULL,
`country` char(3) NOT NULL,
`publish_date` date NOT NULL,
`price` decimal(6,2) NOT NULL,
PRIMARY KEY (`id`),
KEY `author_id` (`author_id`),
KEY `publish_date` (`publish_date`),
KEY `i0` (`country`, `org`, `author_id`, `price`, `publish_date`),
KEY `i1` (`country`, `org`, `author_id`, `publish_date`, `price`)
) ENGINE=InnoDB CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=1;
Here is a SQL Fiddle
So I'm trying to execute a query with two range conditions
SELECT
id as author_id,
(SELECT COUNT(DISTINCT `book`.`id`)+1
FROM `book`
WHERE
`book`.`org` = 1
AND `book`.`country` = 'USA'
AND `book`.`publish_date` BETWEEN '2010-04-30' AND '2011-04-30'
AND `book`.`author_id` = `author`.`id`
AND `book`.`price` < 50
) AS `books_under_fifty`
FROM `author`
ORDER BY books_under_fifty desc;
but the optimizer uses only part of my index: const,const,db_9_6349e2.author.id
from i0
is there a way to optimize it?
Best Answer
Well, let's think out of the box. It seems that there are two sets of results -- the authors with some such books, and those without. The first case is more efficiently done via:
That would probably be best served by
If you are happy with the performance of that, then you can think out of the box in finding "the rest of the authors".
Back to the 2-ranges problem. That can sometimes be solved by Partitioning. Not knowing the distribution of the data (is one year a small subset of the total? or what about price?), I can't say which would be better:
Partitioning by price is out, since
DECIMAL
can't be used. Storingprice
as number of cents and usingPARTITION BY RANGE(cents)
would work, but clumsily.Then make 20-50 partitions of the partition key. This would give you "partition pruning" for one 'range', but then the index needs to be tweaked for the other. Assuming you partition by
publish_date
:Meanwhile, ...
Don't use utf8 if you are using standard country_codes; use ascii, at least for that column.
What is
org
does it distinguish the book? Or are there some missing columns? Is there no other "unique" column(s)?Another variant that might be better:
This variant assumes you have no 'natural' PK. And it will be more efficient because it clusters the desired rows together. I did not include
author_id
in the PK since it is readily available in the row.add index
ADD INDEX ac(author_id, country)
tricked it into usingi0
. Go figure. Note: I left(author_id)
intact and first in the list of indexes; so it is not just the order of the indexes.I'm using Percona's 5.6.22-71.0-log .
Note that there are only 4 different
country
values, evenly distributed. But there are many differentauthor_id
, making it by itself more "selective".