MySQL slow select query

MySQLperformancequery-performanceselect

Have this query that is awfully slow:

SELECT bdp.id 
FROM bd_products bdp JOIN bd_productwords bdpw
ON bdp.id = bdpw.productid JOIN bd_words bdw
ON bdpw.wordid = bdw.id
WHERE bdp.price > 0
AND bdw.word = 'nike' 
ORDER BY bdpw.productid DESC

It returns 18217 rows and takes around 1,4 seconds to run 🙁

Removing the ORDER BY doesn't change the execution time. It stays around 1,4 seconds.

But if I remove the bd_products part like below, execution time improves to just 0,005 seconds.

SELECT DISTINCT bdpw.productid
FROM bd_productwords bdpw JOIN bd_words bdw
ON bdpw.wordid = bdw.id
WHERE bdw.word = 'nike'

It's not the bdp.price part that's the problem. Removing it like below doesn't change the execution time. It stays around 1,4 seconds.

SELECT bdp.id 
FROM bd_products bdp JOIN bd_productwords bdpw
ON bdp.id = bdpw.productid JOIN bd_words bdw
ON bdpw.wordid = bdw.id
WHERE bdw.word = 'nike' 
ORDER BY bdpw.productid DESC
LIMIT 100

The table bd_products contain 2,0 million rows. bd_productwords contain 16,1 million rows and bd_words contain 0,4 million rows.

Any ideas for why execution time is so slow when I include the bd_products part?


Table columns:

bd_words: id (int), word (varchar)
bd_productwords: id (int), wordid (int), productid (int)
bd_products: id (int), name (varchar), description (varchar), price (int)

Indexes for bd_products:

Table   Non_unique  Key_name    Seq_in_index    Column_name Collation   Cardinality Sub_part    Packed  Null    Index_type  Comment Index_comment
bd_products 0   PRIMARY 1   id  A   1837288 \N  \N      BTREE       
bd_products 1   Price   1   price   A   55675   \N  \N  YES BTREE       
bd_products 1   Imageuploaded   1   imageuploaded   A   2   \N  \N  YES BTREE       
bd_products 1   BrandId 1   brand   A   9669    \N  \N  YES BTREE       
bd_products 1   BrandId 2   id  A   1837288 \N  \N      BTREE       
bd_products 1   Id  1   id  A   1837288 \N  \N      BTREE       

Indexes for bd_productwords:

Table   Non_unique  Key_name    Seq_in_index    Column_name Collation   Cardinality Sub_part    Packed  Null    Index_type  Comment Index_comment
bd_productwords 0   PRIMARY 1   id  A   15757529    \N  \N      BTREE       
bd_productwords 1   WordidProductid 1   wordid  A   1313127 \N  \N  YES BTREE       
bd_productwords 1   WordidProductid 2   productid   A   15757529    \N  \N  YES BTREE       
bd_productwords 1   ProductidWordid 1   productid   A   5252509 \N  \N  YES BTREE       
bd_productwords 1   ProductidWordid 2   wordid  A   15757529    \N  \N  YES BTREE       
bd_productwords 1   WordId  1   wordid  A   1125537 \N  \N  YES BTREE       
bd_productwords 1   ProductId   1   productid   A   5252509 \N  \N  YES BTREE

Indexes for bd_words:

Table   Non_unique  Key_name    Seq_in_index    Column_name Collation   Cardinality Sub_part    Packed  Null    Index_type  Comment Index_comment
bd_words    0   PRIMARY 1   id  A   395359  \N  \N      BTREE       
bd_words    1   Word    1   word    A   395359  \N  \N  YES BTREE       

Explain for the awfully slow query:

id  select_type table   type    possible_keys   key key_len ref rows    Extra
1   SIMPLE  bdw ref PRIMARY,Word    Word    603 const   1   Using where; Using index; Using temporary; Using filesort
1   SIMPLE  bdpw    ref WordidProductid,ProductidWordid,WordId,ProductId    WordidProductid 5   dreammodels local.bdw.id    12  Using where; Using index
1   SIMPLE  bdp eq_ref  PRIMARY,Price,Id    PRIMARY 4   dreammodels local.bdpw.productid    1   Using where

Best Answer

SHOW CREATE TABLE is more descriptive than DESCRIBE. And easier to read.

(I am removing the "bp", which I find as being distracting clutter:)

SELECT  p.id
    FROM  words w
    JOIN  productwords pw  ON pw.wordid = w.id
    JOIN  products p  ON p.id = pw.productid
    WHERE  w.word = 'nike'
      AND  p.price > 0
    ORDER BY  pw.productid DESC
    LIMIT  100

If the tables are not InnoDB, convert to such.

productwords seems to be a many:many mapping table; see here for advice on improving its performance .

18K rows after a moderately complex query could reasonably take 1.4 seconds. Some more things to ask:

  • How much RAM?
  • Value of innodb_buffer_pool_size?
  • Disk subsystem (HDD/SSD, etc)?
  • What will you do with 18K ids or even 100? Turn around and ask for more data from some table? That will be inefficient.

The complexity prevents scanning fewer than 18K rows, even with the LIMIT 100. OTOH, if you get rid of all references to products would probably speed it up some.