You must refactor the query to perform WHERE, JOIN and GROUP BY clauses IN THAT EXACT ORDER !!!
Here is the refactored query :
select
sq.search_id,
count(1) search_id_count
from
(select search_id from search where search_id > 47000) sq
inner join
(select search_id from search_hit where search_id > 47000) ssh
using (search_id)
group by
sq.search_id;
Running OPTIMIZE TABLE (reduces table fragmentation and recreate indexes) may not be necessary unless you do heavy INSERTs, UPDATEs, DELETEs. I'll say it is optional.
Give it a Try !!!
Add an index on (tag, hash)
ALTER TABLE tags
DROP INDEX tag,
ADD UNIQUE INDEX tag_hash_UX
(tag, `hash`) ;
Then try this query:
SELECT u.md5, u.url, u.title, u.numsaves
FROM urls AS u
JOIN tags AS t1 ON t1.hash = u.md5
AND t1.tag = 'php'
JOIN tags AS t2 ON t2.hash = u.md5
AND t2.tag = 'tutorials'
ORDER BY u.numsaves DESC
LIMIT 20 ;
and after adding another index:
ALTER TABLE urls
DROP INDEX md5,
ADD INDEX md5_numsaves_IX
(md5, numsaves) ;
try this variation (explanation below, at point 3):
SELECT u.md5, u.url, u.title, u.numsaves
FROM urls AS u
JOIN
( SELECT ui.md5, ui.numsaves
FROM urls AS ui
JOIN tags AS t1 ON t1.hash = ui.md5
AND t1.tag = 'php'
JOIN tags AS t2 ON t2.hash = ui.md5
AND t2.tag = 'tutorials'
ORDER BY ui.numsaves DESC
LIMIT 20
) AS ulim ON ulim.md5 = u.md5
ORDER BY ulim.numsaves DESC ;
Other things you should consider:
- You have 2 identical indexes on
URLS (md5)
, one Primary and one Unique. You don't need both, you can safely delete the redundant (unique) index.
Why was md5
was chosen as the primary key in the first place? Why not a surrogate (auto-incrementing) integer column? That is usually a better choice for primary key in InnODB tables because the primary key is also by default the clustered index of the table.
If you make that change (add an url_id
column and make it PK), you will also need to alter the TAGS
table by adding a url_id
as well and removing the hash
column. This would help efficiency, too. Indexes on int
columns take much less space than indexes on VARCHAR(255)
columns. 4 bytes vs. 757 bytes wide is a lot of difference.
Read this blog post about text and blob columns in InnoDB: Handling long texts/blobs in InnoDB If the issue turns out to be the text columns, you could split the table into two tables, having an 1-1 relationship between them, so text columns are separately from the narrow columns (or try the index I suggested, which is a similar fix for this.)
Best Answer
If you just wish to create a single index, CREATE INDEX would be the way to go.
However, if you wish to setup your table with multiple indices, setting up a new empty table and then copying from your old to your new table is IMO a more practical solution, since you can get everything ready and then simply run the insert when it's convenient.