MySQL: Why is this UPDATE query so incredibly slow

mariadbMySQLperformancequery-performance

So I have a landing table for my data called XMLImport, and a products table called ActiveProducts. When there are changes to a product in the XMLImport table I need it to update the ActiveProducts table, but only update the records which are needed and keep track of which ones.

This is the code I am using to find changes :

SELECT DISTINCT sku FROM (
  SELECT name,size,colour,sku,quantity,price FROM ActiveProducts
  UNION ALL
  SELECT name,size,colour,sku,quantity,price FROM XMLImport
) tbl
GROUP BY name,size,colour,sku,quantity,price
HAVING count(*) = 1
ORDER BY name,size,colour,sku,quantity,price

This is the code I am using to update :

UPDATE ActiveProducts ap INNER JOIN XMLImport xml ON ap.sku = xml.sku
SET
  ap.name = xml.name,
  ap.size = xml.size,
  ap.colour = xml.colour,
  ap.instock = xml.instock,
  ap.quantity = xml.quantity,
  ap.price = xml.price,
  ap.updated = 1

When the two pieces of code are run separately they complete in less than a second each. However when combined like below, the query runs for HOURS !

UPDATE ActiveProducts ap INNER JOIN XMLImport xml ON ap.sku = xml.sku
SET
  ap.name = xml.name,
  ap.size = xml.size,
  ap.colour = xml.colour,
  ap.instock = xml.instock,
  ap.quantity = xml.quantity,
  ap.price = xml.price,
  ap.updated = 1
WHERE ap.sku IN (
SELECT DISTINCT sku FROM (
  SELECT name,size,colour,sku,quantity,price FROM ActiveProducts
  UNION ALL
  SELECT name,size,colour,sku,quantity,price FROM XMLImport
) tbl
GROUP BY name,size,colour,sku,quantity,price
HAVING count(*) = 1
ORDER BY name,size,colour,sku,quantity,price
);

Below is the structure

ActiveProducts.Product_id - PrimeKey
ActiveProducts.name - Index
ActiveProducts.sku - Index

XMLImport.Product_id - PrimeKey
XMLImport.name - Index
XMLImport.sku - Index

Server version: 5.5.44-MariaDB MariaDB Server

Tables: InnoDB

Best Answer

Why don't you just build a temp table and use the results to update. This is bad form. It's got to build the Subquery first, then assemble that in the WHERE Clause, then it's going to update every record to find a match.

Instead just INNER JOIN to a temp table that does the first part, or a CTE if you're using SQL Server and can't have a multi-step process.

WITH SkuList AS
(
SELECT DISTINCT sku FROM (
  SELECT name,size,colour,sku,quantity,price FROM ActiveProducts
  UNION ALL
  SELECT name,size,colour,sku,quantity,price FROM XMLImport
) tbl
GROUP BY name,size,colour,sku,quantity,price
HAVING count(*) = 1
ORDER BY name,size,colour,sku,quantity,price
)
UPDATE ActiveProducts ap INNER JOIN SkuList xml ON ap.sku = xml.sku
SET
  ap.name = xml.name,
  ap.size = xml.size,
  ap.colour = xml.colour,
  ap.instock = xml.instock,
  ap.quantity = xml.quantity,
  ap.price = xml.price,
  ap.updated = 1