You don't need all the derived tables. You are joining the basic (product
) too many times. You can write the query joining it only once.
Compound indices are a must for EAV designs. Try adding an index on (attribute_id, product_id, value)
and then the query:
SELECT t0.id,
t1.`value` AS length,
t2.`value` AS height,
t3.`value` AS family
FROM
products t0
INNER JOIN
product_eav_decimal t1
ON t1.product_id = t0.id
AND t1.attribute_id = 91
AND t1.`value` BETWEEN 15 AND 35
LEFT JOIN
product_eav_decimal t2
ON t2.product_id = t0.id
AND t2.attribute_id = 80
--
--
--
LEFT JOIN -- LEFT or INNER join
product_eav_decimal t6
ON t6.product_id = t0.id
-- AND t6.attribute_id =
ORDER BY t0.id ASC ;
In your example query, neither way is particularly better, because the inner queries have their own where clauses, are generating a very small result-set, and are able to do this very efficiently presumably because the invoice
column in pending
and done
are indexed.
If you were doing something more complex, and especially without such highly-selective where clauses in the inter queries, the union would likely perform better... because in the derived table scenario, the derived table is materialized before the joins to the outer tables and indexes on the tables used in the inner query can't be used to handle the join... on the other hand, with UNION
, the optimizer could use indexes on pending/done if they were appropriate.
Personally, I'd be inclined to solve this one by writing one or more stored procedures to encapsulate the union queries and return the answer as a result set using an unbounded SELECT
.
MySQL supports a very useful extension that enables the use of regular SELECT
statements (that is, without using cursors or local variables) inside a stored procedure. The result set of such a query is simply sent directly to the client.
http://dev.mysql.com/doc/refman/5.5/en/stored-routines-syntax.html
These are called "unbounded." So, a stored procedure that takes the invoice id as an input parameter, does the SELECT
... UNION ALL
... SELECT
can encapsulate everything and give you a very clean way to ask the database for what you need and get a record-set in return.
CALL get_invoice_customer_location_detail_by_invoice_id(123456); # long, but much more appealing
If you subsequently decide union or derived really was The One True Way after all, then you only have to update that in one place, in the procedure definition.
Free tip: UNION
means UNION DISTINCT
in non-dinosaur versions of MySQL, while some older versions will interpret it as UNION ALL
. It's probably good practice to explicitly use the one you actually want, so you always get the behavior you expect. It's also generally good from a performance standpoint to use UNION ALL
unless you need UNION DISTINCT
because it's one less thing for the optimizer to have to deal with, but this is much more the case with large result sets, where a lot of time could be spent de-duplicating a set of data that is already free of duplicates.
Best Answer
This is tricky question, specially since 5.6, in which the optimizer has changed the way subqueries are executed. Specially, having into account that the definition of "derived table" may change person to person
From 4.1 to 5.5, in many cases, either a temporary table would be created or a cross join will be done between tables, filtering later. In some cases, like the infamous
SELECT ... IN (SELECT ...)
a derived table will not be created, and it will be executed as a correlated subquery (one row at a time). See the examples here: http://forums.mysql.com/read.php?115,576862,576999#msg-576999On that same thread, Roy Liseng comment the great subquery improvements of 5.6: http://forums.mysql.com/read.php?115,576862,577215#msg-577215 You can see a summary of the improvements on the 5.6 subquery optimization on the manual, including the creation of indexes on materialization, or its conversion into semi-joins. More details on some of the Oracle engineer blogs.
I think that I had to mention this because there are a lot of things going on on the optimizer, and not always a temporary table is created first "as is".
Regarding the other question, MySQL always tries to create internal temporary tables on MEMORY first, but failsback to disk if they are too large (larger than
min(tmp_table_size, max_heap_table_size)
or some other restrictions (like havingBLOBS
).