MySQL – Optimizing Queries Without Index Hints


Here's an excerpt of this article:

SELECT id, name, address, phone 
FROM customers 
ORDER BY name 
LIMIT 10 OFFSET 990;

MySQL is first scanning an index then retrieving rows in the table by
primary key id. So it’s doing double lookups and so forth.

The following piece just uses the primary key:

SELECT id
FROM customers
ORDER BY name
LIMIT 10 OFFSET 990;

I can't figure out the difference between those two queries despite the explanation, especially the double lookup it mentions.

Could someone explain it in more detail?

Best Answer

The table has a secondary index on name (let's call it idx_name for reference). Internally, MySQL (InnoDB) stores each secondary index record as a (key, value) pair, where the key is the indexed column and the value is the primary key. In this case it is (name, id).

To execute the first query, MySQL chose to use idx_name. But to satisfy the query it has to take the id stored with each name and look it up in the PRIMARY index to fetch the address and phone values. That second lookup per row is what the article calls a "double lookup".

For the second query, MySQL has every column it needs in idx_name itself, because id is part of every index entry. It never has to visit the PRIMARY index.
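
To see the difference on your own server, here is a minimal sketch. It assumes an InnoDB table shaped like the one in the question. The column types are guesses, and idx_name is only the name used in this answer. The last statement is a common way to exploit the covered query (often called a deferred join); it is not necessarily what the article itself does.

```sql
-- Hypothetical schema: the column types are assumptions, not taken from the question.
CREATE TABLE customers (
  id      INT UNSIGNED NOT NULL AUTO_INCREMENT,
  name    VARCHAR(100) NOT NULL,
  address VARCHAR(255),
  phone   VARCHAR(32),
  PRIMARY KEY (id),
  KEY idx_name (name)   -- InnoDB keeps (name, id) in every entry of this index
) ENGINE=InnoDB;

-- Query 1: address and phone are not in idx_name, so every row read
-- through idx_name costs a second lookup in the PRIMARY index.
EXPLAIN
SELECT id, name, address, phone
FROM customers
ORDER BY name
LIMIT 10 OFFSET 990;

-- Query 2: id and name are both in idx_name, so the index alone answers it.
-- When idx_name is used this way, the Extra column should show "Using index".
EXPLAIN
SELECT id
FROM customers
ORDER BY name
LIMIT 10 OFFSET 990;

-- Deferred join: page through the index first, then fetch the wide
-- columns by primary key for the 10 surviving ids only.
SELECT c.id, c.name, c.address, c.phone
FROM customers AS c
INNER JOIN (
  SELECT id
  FROM customers
  ORDER BY name
  LIMIT 10 OFFSET 990
) AS page USING (id)
ORDER BY c.name;
```

Which plan the optimizer picks for Query 1 depends on table size and statistics, so run the EXPLAIN statements against realistic data rather than an empty table.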

Related Solutions

MySQL – How to optimize this MySQL query further

After looking over the query, the tables, and the WHERE and GROUP BY clauses, I recommend the following:

Recommendation #1) Refactor the Query

I reorganized the query to do three (3) things:

Create smaller temp tables
Process the WHERE clause on those temp tables
Delay the join until the very end

Here is my proposed query:

SELECT
  sounds.*,srkeys.avg_rating,srkeys.votes
FROM
(
  SELECT AA.id,avg(BB.rating) AS avg_rating, count(BB.rating) AS votes FROM
  (
    SELECT id FROM sounds
    WHERE blacklisted = false 
    AND   ready_for_deployment = true 
    AND   deployed = true 
    AND   type = "Sound" 
    AND   created_at > '2011-03-26 21:25:49'
  ) AA INNER JOIN
  (
    SELECT AAA.rating,AAA.rateable_id
    FROM ratings AAA
    WHERE rateable_type = 'Sound'
  ) BB
  ON AA.id = BB.rateable_id
  GROUP BY BB.rateable_id
) srkeys INNER JOIN sounds USING (id);

Recommendation #2) Index the sounds table with an index that will accommodate the WHERE clause

The columns of this index include all the columns from the WHERE clause, with the columns compared to static values first and the moving target (the created_at range) last.

ALTER TABLE sounds ADD INDEX support_index
(blacklisted,ready_for_deployment,deployed,type,created_at);
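
Once the index exists, a quick check like the sketch below can confirm whether the optimizer actually uses it. It reuses the innermost subquery from the proposed query as-is. The only assumption is that id is the primary key of sounds.

```sql
-- Run after adding support_index.
EXPLAIN
SELECT id FROM sounds
WHERE blacklisted = false
AND   ready_for_deployment = true
AND   deployed = true
AND   type = 'Sound'
AND   created_at > '2011-03-26 21:25:49';
-- What to look for in the output:
--   key     : support_index, if the optimizer chose it
--   key_len : grows with the number of index columns actually used
--   rows    : the estimated number of rows examined
```

If key shows something other than support_index, the optimizer judged another access path cheaper for the current data.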

I sincerely believe you will be pleasantly surprised. Give it a Try !!!

UPDATE 2011-05-21 19:04

I just saw the cardinality. OUCH !!! Cardinality of 1 for rateable_id. Boy, I feel stupid !!!

UPDATE 2011-05-21 19:20

Maybe making the index will be enough to improve things.

UPDATE 2011-05-21 22:56

Please run this:

EXPLAIN SELECT
  sounds.*,srkeys.avg_rating,srkeys.votes
FROM
(
  SELECT AA.id,avg(BB.rating) AS avg_rating, count(BB.rating) AS votes FROM
  (
    SELECT id FROM sounds
    WHERE blacklisted = false 
    AND   ready_for_deployment = true 
    AND   deployed = true 
    AND   type = "Sound" 
    AND   created_at > '2011-03-26 21:25:49'
  ) AA INNER JOIN
  (
    SELECT AAA.rating,AAA.rateable_id
    FROM ratings AAA
    WHERE rateable_type = 'Sound'
  ) BB
  ON AA.id = BB.rateable_id
  GROUP BY BB.rateable_id
) srkeys INNER JOIN sounds USING (id);

UPDATE 2011-05-21 23:34

I refactored it again. Try This One Please:

EXPLAIN
  SELECT AA.id,avg(BB.rating) AS avg_rating, count(BB.rating) AS votes FROM
  (
    SELECT id FROM sounds
    WHERE blacklisted = false 
    AND   ready_for_deployment = true 
    AND   deployed = true 
    AND   type = "Sound" 
    AND   created_at > '2011-03-26 21:25:49'
  ) AA INNER JOIN
  (
    SELECT AAA.rating,AAA.rateable_id
    FROM ratings AAA
    WHERE rateable_type = 'Sound'
  ) BB
  ON AA.id = BB.rateable_id
  GROUP BY BB.rateable_id
;

UPDATE 2011-05-21 23:55

I refactored it again. Try This One Please (Last Time):

EXPLAIN
  SELECT A.id,avg(B.rating) AS avg_rating, count(B.rating) AS votes FROM
  (
    SELECT BB.* FROM
    (
      SELECT id FROM sounds
      WHERE blacklisted = false 
      AND   ready_for_deployment = true 
      AND   deployed = true 
      AND   type = "Sound" 
      AND   created_at > '2011-03-26 21:25:49'
    ) AA INNER JOIN sounds BB USING (id)
  ) A INNER JOIN
  (
    SELECT AAA.rating,AAA.rateable_id
    FROM ratings AAA
    WHERE rateable_type = 'Sound'
  ) B
  ON A.id = B.rateable_id
  GROUP BY B.rateable_id;

UPDATE 2011-05-22 00:12

I hate giving up !!!!

EXPLAIN
  SELECT A.*,avg(B.rating) AS avg_rating, count(B.rating) AS votes FROM
  (
    SELECT BB.* FROM
    (
      SELECT id FROM sounds
      WHERE blacklisted = false 
      AND   ready_for_deployment = true 
      AND   deployed = true 
      AND   type = "Sound" 
      AND   created_at > '2011-03-26 21:25:49'
    ) AA INNER JOIN sounds BB USING (id)
  ) A,
  (
    SELECT AAA.rating,AAA.rateable_id
    FROM ratings AAA
    WHERE rateable_type = 'Sound'
  ) B
  -- A derived table cannot reference its sibling A, so the match is made here
  WHERE B.rateable_id = A.id
  GROUP BY B.rateable_id;

UPDATE 2011-05-22 07:51

It has been bothering me that ratings is coming back with 2 million rows in the EXPLAIN. Then, it hit me. You might need another index on the ratings table which starts with rateable_type:

ALTER TABLE ratings ADD INDEX
rateable_type_rateable_id_ndx (rateable_type,rateable_id);

The goal of this index is to shrink the temp table that manipulates ratings so that it holds fewer than 2 million rows. If we can get that temp table significantly smaller (by at least half), then there is a better chance of both your query and mine running faster.

After making that index, please retry my original proposed query and also try yours:
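
To check whether the new index helps, a sketch along these lines compares the row estimate before and after the ALTER TABLE. It assumes the rating column is named rating, as the outer query's avg(BB.rating) implies.

```sql
-- Run once before and once after adding rateable_type_rateable_id_ndx,
-- then compare the "rows" estimate against the 2 million seen earlier.
EXPLAIN
SELECT AAA.rating, AAA.rateable_id
FROM ratings AAA
WHERE rateable_type = 'Sound';

-- Shows the cardinality estimate for each indexed column.
SHOW INDEX FROM ratings;
```

If nearly every row in ratings has rateable_type = 'Sound', the estimate will not drop much, because the index has almost nothing to filter out.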

SELECT
  sounds.*,srkeys.avg_rating,srkeys.votes
FROM
(
  SELECT AA.id,avg(BB.rating) AS avg_rating, count(BB.rating) AS votes FROM
  (
    SELECT id FROM sounds
    WHERE blacklisted = false 
    AND   ready_for_deployment = true 
    AND   deployed = true 
    AND   type = "Sound" 
    AND   created_at > '2011-03-26 21:25:49'
  ) AA INNER JOIN
  (
    SELECT AAA.rating,AAA.rateable_id
    FROM ratings AAA
    WHERE rateable_type = 'Sound'
  ) BB
  ON AA.id = BB.rateable_id
  GROUP BY BB.rateable_id
) srkeys INNER JOIN sounds USING (id);

UPDATE 2011-05-22 18:39 : FINAL WORDS

I had refactored a query in a stored procedure and added an index to help answer a question on speeding things up. I got 6 upvotes, had the answer accepted, and picked up a 200 bounty.

I had also refactored another query (marginal results) and added an index (dramatic results). I got 2 upvotes and had the answer accepted.

I added an index for yet another query challenge and was upvoted once.

And now your question.

My wanting to answer questions like these (including yours) was inspired by a YouTube video I watched on refactoring queries.

Thank you again, @coneybeare !!! I wanted to answer this question to fullest extent possible, not just accept points or accolades. Now, I can feel that I earned the points !!!

MySQL – How to Handle Records Turnover

Not tested, but the idea is to use the same query twice with different LIMITs (depending on the %HOURS passed) in a UNION.

(
SELECT * FROM villa_table v
 ORDER BY villa_order ASC, v.ID
 LIMIT %HOURS, 999999999999
) UNION ALL (
SELECT * FROM villa_table v
 ORDER BY villa_order ASC, v.ID
 LIMIT 0, %HOURS
)

You'll need to fill in %HOURS from your scripting language or stored procedure. Also, once %HOURS is larger than the COUNT(*) of villa_table, you'll need to restart it from 0.

Note that the parentheses are necessary: they make each ORDER BY and LIMIT apply to its own SELECT rather than to the UNION as a whole.

Also note that the ORDER BY fields must uniquely identify rows (i.e. append the PRIMARY KEY!) to prevent possible ambiguous sorting.
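
Equally untested, here is one way the %HOURS substitution and the restart from 0 could be done inside MySQL itself. The variable names @hours_passed, @total and @shift are made up for illustration, and the value 30 stands in for whatever your application supplies. A prepared statement is used because LIMIT accepts literals or ? placeholders, not arbitrary expressions.

```sql
-- Hypothetical input: hours elapsed, normally supplied by the application.
SET @hours_passed = 30;

-- Wrap the shift around the table size so it restarts from 0.
SELECT COUNT(*) INTO @total FROM villa_table;
SET @shift = @hours_passed MOD @total;   -- NULL if the table is empty

PREPARE rotate FROM
  '(SELECT * FROM villa_table v
      ORDER BY villa_order ASC, v.ID
      LIMIT ?, 999999999999)
   UNION ALL
   (SELECT * FROM villa_table v
      ORDER BY villa_order ASC, v.ID
      LIMIT 0, ?)';

EXECUTE rotate USING @shift, @shift;
DEALLOCATE PREPARE rotate;
```

The MOD step replaces the manual reset: with 24 rows and 30 hours passed, the shift becomes 6. Guard against an empty table before executing, since a NULL shift is not a valid LIMIT argument.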
