Mysql – Why is the subquery apparently correlated? How to avoid it

MySQLsubquery

I have a table orders and a table orderpositions. The orderpositions (basically the items bought with the order) are linked via orderid (indexed) to the orders primary key.

For a given set of orders I need the orderpositions. Because the conditions with which the orders are determined are dynamic I need to use a subquery instead of a JOIN, so I wrote:

SELECT *
FROM orderpositions
WHERE orderpositions.orderid IN
(
    SELECT id FROM orders WHERE userid=118
);

(The sub-query can be something entirely different, perhaps with nested JOINS and whatnot, but will always return a list of order IDs, of which I need the linked orderpositions.)

Now this runs significantly slower than the JOIN equivalent (which is trivial in this example, but not in others). I have read about correlated queries and believe that my approach is very slow because the subquery is executed once for each row in the orderpositions table.

As a test I have tried to create a temporary table, insert the order IDs into it and then JOIN on that, which is also very fast. But I would like to avoid the overhead of the temporary table and firing off multiple queries.

Is there any way to force the sub-query to be uncorrelated?

Best Answer

Your suppositions are correct, MySQL versions up to 5.5 only can execute those kind of subqueries (semi-joins) in that particular way. The subquery is not correlated in reality (there is no real dependency), but the DEPENDENT SUBQUERY that you may get from explain means that it is executing it as if it was one.

Please note that in some scenarios, that execution plan could be the most optimal one (checking the existence of the rows one by one), but in many others, like yours, it will probably not. This is a very well known query plan limitation of MySQL <= 5.5, as you can see on these slides I created about query optimization:

You can either upgrade to 5.6 (or use >= MariaDB 5.3) where there are more options available for its execution; or manually convert the query to a JOIN or a scalar subquery if you want to stick with MySQL <=5.5. The 5.6 and the current in-development 5.7 contain a great number of optimizations that make the life easier for the developer.