Recently, I started working on a project, where I came across many MySQL statements like this:
SELECT s.Important, c.Interesting
FROM table1 s, table2 c
WHERE s.IdForTableC = c.id
or
SELECT q.Important, c.Interesting
FROM table1 q, table2 p, table3 c
WHERE q.IdInTableP = p.id
AND p.IdInTableC = c.id
AND c.Measure = 1
I am completely new to optimizing MySQL and read a few guides that advised me to use EXPLAIN SELECT
. I did and get type all
for the above query and type index
for the second one. I guess this means, that both queries should be optimized, but I don't know how. Is it even possible without changing the whole structure of the database?
EDIT: the EXPLAIN output of the first one is:
'1', 'SIMPLE', 'c', 'ALL', 'PRIMARY', NULL, NULL, NULL, '467', ''
'1', 'SIMPLE', 's', 'ref', 'IdInTableC', 'IdInTableC', '5', 'table2.c.id', '2', 'Using where'
and INDEXES are:
'table2', '0', 'PRIMARY', '1', 'id', 'A', '14184441', NULL, NULL, '', 'BTREE', '', ''
'table2', '1', 'otherID', '1', 'otherID', 'A', '92708', NULL, NULL, '', 'BTREE', '', ''
'table2', '1', 'IdInTableC', '1', 'IdInTableC', 'A', '3048', NULL, NULL, '', 'BTREE', '', ''
Best Answer
Query 1
EXPLAIN
is not simple to follow. First, let me explain how to optimize the two queries.That's the old way to write a
JOIN
; please learn the new way:The
ON
explains how the two table relate to each other.Now there is no filtering via a
WHERE
clause, so all that can be done isIf
s
is used for step 1, then an index onid
is needed inc
for step 2.If
c
is used for step 1, then an index onIdForTableC
is needed ins
for step 2.The convention is to name the
PRIMARY KEY
"id
". A table must have aPRIMARY KEY
. I'll guess thatc
hasPRIMARY KEY(id)
. This makes starting withs
a slam-dunk. But, instead, the Optimizer decided to start withc
-- as seen by it being row 1 in theEXPLAIN
."ALL" means that the entire table is scanned. Some of the NULLs go along with that. "PRIMARY" does not mean much in this context.
c
has about 467 rows.To get into
s
, it usedtable2.c.id
. Something is wrong there. That syntax says "database table2, table c, column id". Is your database called "table2". Ugh.The index named
IdInTableC
was used. Presumably, you hadso that name is the same as the column.
5
probably means it is aNULLable
INT
. UseNOT NULL
unless you have a purpose forNULL
. TheJOIN
will find an estimated 2 rows ins
.I guessed wrong as to which order
c
ands
would be used. But the symmetry of the query implies that there might not be a lot of difference in which to start with.Query 2
Again, let's use the new
JOIN
syntax:It is usually best to "filter" first. The only filter is
c.Measure = 1
, soc
will probably be the first table, especially if you haveThen, the chain must be: c -> p -> q. So these are needed:
With those indexes, all three tables should use
INDEX
, perhaps withref
oreq_ref
.More Tips
In the Extras column:
ORDER BY
implies a sort. The absence of "filesort" implies an optimization that is sometimes available.For some basic index-creation tips, see http://mysql.rjweb.org/doc.php/index_cookbook_mysql
I don't use
EXPLAIN
that much; I try to go straight from theSELECT
to the optimal index.