I have 2 tables, one for links (Links
) and one for comments (Comments
) where each comment contains a reference to a link (it can also point to NULL
if it has no associated link, such comments we disregard for a moment). I want to order links such that links with newest comments are on top.
Currently, I do this:
SELECT DISTINCT linkId FROM Comments WHERE
linkId IS NOT NULL AND
dateAdded <= ?
ORDER BY id DESC
LIMIT 10
This is working very well (around 1ms with current 60k rows, I've tested it also with 800k rows, it still performs just as well)
However, now I'd like to classify the links into different categories and therefore I created a new column category
in Links
table and I want to order the links within their category. I tried running following query:
SELECT DISTINCT c.linkId FROM Comments c LEFT JOIN Links l ON c.linkId = l.id WHERE
c.linkId IS NOT NULL AND
c.dateAdded <= ? AND
l.category = ?
ORDER BY c.id DESC
LIMIT 10
This makes the query execution time become ~1.5s with the table with 800k rows so it is obviously not scaling well.
The tables are defined as follows:
CREATE TABLE `Links` (
`id` int(10) UNSIGNED NOT NULL,
`category` int(10) UNSIGNED NOT NULL,
`url` varchar(2083) CHARACTER SET utf8mb4 NOT NULL,
`dateAdded` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
ALTER TABLE `Links`
ADD PRIMARY KEY (`id`),
ADD KEY `category` (`category`);
CREATE TABLE `Comments` (
`id` int(10) UNSIGNED NOT NULL,
`linkId` int(10) UNSIGNED DEFAULT NULL,
`content` varchar(2000) CHARACTER SET utf8mb4 NOT NULL,
`dateAdded` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
ALTER TABLE `Comments`
ADD PRIMARY KEY (`id`),
ADD KEY `idx2` (`linkId`,`dateAdded`);
ALTER TABLE `Comments`
ADD CONSTRAINT `fk_link_id` FOREIGN KEY (`linkId`) REFERENCES `Links` (`id`) ON DELETE CASCADE ON UPDATE CASCADE;
Is there any way how this query can be optimized?
I guess creating category
column in Comments
table would help but that could lead to inconsistencies (i.e. link pointing to one category but its links pointing to another category).
Best Answer
Because of filtering on two tables, it is not possible for a single index to handle all the
WHERE
, to handle theORDER BY, much less optimize the
LIMIT 10`.The
LEFT JOIN
seems to be aJOIN
.This link on
comments
may help:INDEX(dateAdded, linkId, id)
(in that order).If you want to discuss further, please provide
EXPLAIN SELECT ...
andEXPLAIN FORMAT=JSON SELECT ...