Please keep in mind that OPTIMIZE TABLE does not defragment a table in place. Internally, OPTIMIZE TABLE rebuilds the table in several steps (copying the data to a temp file, recreating the indexes, recomputing index statistics). In fact, the same work can be performed manually, as the example below shows.
Example: to optimize mydb.mytable, you enter this command:
OPTIMIZE TABLE mydb.mytable;
Note that MySQL performs something like the following under the hood:
CREATE TABLE mydb.mytable2 LIKE mydb.mytable;
ALTER TABLE mydb.mytable2 DISABLE KEYS;
INSERT INTO mydb.mytable2 SELECT * FROM mydb.mytable;
ALTER TABLE mydb.mytable2 ENABLE KEYS;
DROP TABLE mydb.mytable;
ALTER TABLE mydb.mytable2 RENAME mydb.mytable;
ANALYZE TABLE mydb.mytable;
This is quite useful for tables that experience a high volume of UPDATEs and DELETEs.
Performing this can accomplish two things:
- It spares MySQL from searching through the fragments in a table to find one of the right size for new data. Eliminating the fragments reduces that work.
- Having the index statistics recomputed helps the MySQL Query Optimizer construct better EXPLAIN plans. Otherwise, queries may deteriorate in execution time because the MySQL Query Optimizer is working from stale statistics and guesses badly at the plan. This is a typical symptom of a table that has had a high volume of UPDATEs and DELETEs.
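Before rebuilding, it can help to look at how much reclaimable space the table reports. This is only a rough sketch, reusing the mydb.mytable names from the example above. DATA_FREE is a real column of information_schema.TABLES, but for InnoDB tables stored in the shared system tablespace it reflects free space in that tablespace rather than in the single table, so treat the figure as an indicator, not a measurement.

```sql
-- Rough fragmentation check for one table (names taken from the example above).
-- A large data_free relative to data_length suggests a rebuild may pay off.
SELECT table_schema,
       table_name,
       engine,
       data_length,
       index_length,
       data_free
FROM information_schema.tables
WHERE table_schema = 'mydb'
  AND table_name   = 'mytable';
```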
CAVEAT
Cache effectiveness drops quickly while this runs, because the rebuild does a full table scan. For MyISAM, index pages flow in and out of the MyISAM Key Cache. For InnoDB, data and index pages flow in and out of the InnoDB Buffer Pool.
Add an index on (tag, hash):
ALTER TABLE tags
  DROP INDEX tag,
  ADD UNIQUE INDEX tag_hash_UX
    (tag, `hash`) ;
Then try this query:
SELECT u.md5, u.url, u.title, u.numsaves
FROM urls AS u
  JOIN tags AS t1 ON t1.hash = u.md5
                 AND t1.tag = 'php'
  JOIN tags AS t2 ON t2.hash = u.md5
                 AND t2.tag = 'tutorials'
ORDER BY u.numsaves DESC
LIMIT 20 ;
and after adding another index:
ALTER TABLE urls
  DROP INDEX md5,
  ADD INDEX md5_numsaves_IX
    (md5, numsaves) ;
try this variation (explained below, in the point about text and blob columns):
SELECT u.md5, u.url, u.title, u.numsaves
FROM urls AS u
  JOIN
    ( SELECT ui.md5, ui.numsaves
      FROM urls AS ui
        JOIN tags AS t1 ON t1.hash = ui.md5
                       AND t1.tag = 'php'
        JOIN tags AS t2 ON t2.hash = ui.md5
                       AND t2.tag = 'tutorials'
      ORDER BY ui.numsaves DESC
      LIMIT 20
    ) AS ulim ON ulim.md5 = u.md5
ORDER BY ulim.numsaves DESC ;
Other things you should consider:
- You have two identical indexes on URLS (md5), one primary and one unique. You don't need both; you can safely drop the redundant (unique) index.
- Why was md5 chosen as the primary key in the first place? Why not a surrogate (auto-incrementing) integer column? That is usually a better choice for the primary key in InnoDB tables, because the primary key is also by default the clustered index of the table.
  If you make that change (add a url_id column and make it the PK), you will also need to alter the TAGS table by adding a url_id column as well and removing the hash column. This would help efficiency, too: indexes on int columns take much less space than indexes on VARCHAR(255) columns. 4 bytes vs. 757 bytes wide is a lot of difference.
- Read this blog post about text and blob columns in InnoDB: "Handling long texts/blobs in InnoDB". If the issue turns out to be the text columns, you could split the table into two tables with a 1-1 relationship between them, so that the text columns are stored separately from the narrow columns (or try the index I suggested, which is a similar fix for this).
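As a rough illustration of the surrogate-key suggestion, the tables might end up shaped something like this. It is a sketch of a target layout, not a migration script for the existing tables. Only the column names md5, url, title, numsaves and tag come from the queries above; url_id is the proposed new column, and every column type, the index names md5_UX and url_id_IX, and the foreign key are assumptions.

```sql
-- Target shape only; types and index names are assumptions, not the real schema.
CREATE TABLE urls (
  url_id   INT UNSIGNED NOT NULL AUTO_INCREMENT,  -- narrow surrogate key, clustered
  md5      CHAR(32)     NOT NULL,                 -- assumed type
  url      TEXT         NOT NULL,                 -- assumed type
  title    TEXT,                                  -- assumed type
  numsaves INT UNSIGNED NOT NULL DEFAULT 0,       -- assumed type
  PRIMARY KEY (url_id),
  UNIQUE KEY md5_UX (md5)                         -- md5 stays unique, but is no longer the PK
) ENGINE=InnoDB;

CREATE TABLE tags (
  tag    VARCHAR(64)  NOT NULL,                   -- assumed length
  url_id INT UNSIGNED NOT NULL,                   -- replaces the wide hash column
  PRIMARY KEY (tag, url_id),
  KEY url_id_IX (url_id),
  FOREIGN KEY (url_id) REFERENCES urls (url_id)
) ENGINE=InnoDB;

-- The two-tag query then joins on the 4-byte integer instead of the hash.
SELECT u.url_id, u.url, u.title, u.numsaves
FROM urls AS u
  JOIN tags AS t1 ON t1.url_id = u.url_id
                 AND t1.tag = 'php'
  JOIN tags AS t2 ON t2.url_id = u.url_id
                 AND t2.tag = 'tutorials'
ORDER BY u.numsaves DESC
LIMIT 20;
```

With (tag, url_id) as the primary key of tags, each tag lookup reads a contiguous range of the clustered index and yields the integer needed for the join directly.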
Best Answer
Information Schema is not like other databases:
source - https://dev.mysql.com/doc/refman/5.7/en/information-schema.html
There are not many ways to optimise queries over it; there are a few recommendations (very standard for any queries):
https://dev.mysql.com/doc/refman/5.7/en/information-schema-optimization.html
These recommendations are really standard and, in your case with count(*), will not help.