There's no guarantee in SQL Server if or in which order the statements will be processed in a WHERE
clause. The single expression that allows statement short-circuiting is CASE
-WHEN
. The following is from an answer I posted on Stackoverflow:
How SQL Server short-circuits WHERE condition evaluation
It does when it feels like it, but not in the way you immediately think of.
As a developer you must be aware that SQL Server does not do short-circuiting like it is done in other programming languages and there's nothing you can do to force it to.
For further details check the first link in the above blog entry, which is leading to another blog:
Does SQL Server Short-Circuit?
The final verdict? Well, I don't really have one yet, but it is probably safe to say that the only time you can ensure a specific short-circuit is when you express multiple WHEN conditions in a CASE expression. With standard boolean expressions, the optimizer will move things around as it sees fit based on the tables, indexes and data you are querying.
According to Chapter 9 (Parser and Optimizer), Page 172 of the Book Understanding MySQL Internals by Sasha Pachev
here is the breakdown the evaluation of a query as the following tasks:
- Determine which keys can be used to retrieve the records from tables, and choose the best one for each table.
- For each table, decide whether a table scan is better that reading on a key. If there are a lot of records that match the key value, the advantages of the key are reduced and the table scan becomes faster.
- Determine the order in which tables should be joined when more than one table is present in the query.
- Rewrite the WHERE clauses to eliminate dead code, reducing the unnecessary computations and changing the constraints wherever possible to the open the way for using keys.
- Eliminate unused tables from the join.
- Determine whether keys can be used for
ORDER BY
and GROUP BY
.
- Attempt to simplify subqueries, as well as determine to what extent their results can be cached.
- Merge views (expand the view reference as a macro)
On that same page, it says the following:
In MySQL optimizer terminology, every query is a set of joins. The term join is used here more broadly than in SQL commands. A query on only one table is a degenerate join. While we normally do not think of reading records from one table as a join, the same structures and algorithms used with conventional joins work perfectly to resolve the query with only one table.
EPILOGUE
Because of the keys present, the amount of data, and the expression of the query, MySQL Joins may sometimes do things for our own good (or to get back at us) and come up with results we did not expect and cannot quickly explain.
I wrote about this quirkiness before
because the MySQL Query Optimizer could make dismiss certain keys during the query's evaluation.
@Phil's comment help me see how to post this answer (+1 for @Phil's comment)
@ypercube's comment (+1 for this one too) is a compact version of my post because MySQL's Query Optimizer is primitive. Unfortunately, it has to be since it deals with outside storage engines.
CONCLUSION
As for your actual question, the MySQL Query Optimizer would determine the performance metrics of each query when it is done
- counting rows
- selecting keys
- massaging intermittent results sets
- Oh yeah, doing the actual JOIN
You would probably have to coerce the order of execution by rewriting (refactoring) the query
Here is the first Query you gave
select count(*)
from table1 a
join table2 b
on b.key_col=a.key_col
where b.tag = 'Y';
Try rewriting it to evaluate the WHERE first
select count(*)
from table1 a
join (select key_col from table2 where tag='Y') b
on b.key_col=a.key_col;
That would definitely alter the EXPLAIN plan. It could produce better or worse results.
I once answered a question in StackOverflow where I applied this technique. The EXPLAIN was horrendous but the performance was dynamite. It only worked because of having the correct indexes present and the use of LIMIT in a subquery.
As with stock prices, when it comes to Queries and trying to express them, restrictions apply, results may vary, and past performance is not indicative of future results.
Best Answer
How MySQL processes the query largely (and primarily) depends on the indexes on the table.
If no indexes exist, then of course a full table scan is required. Now, you are looking for a way to reduce the search space based on conditions. However:
LIKE
clause with a prefix '%'The
LIKE
clause, beginning with '%' negates any use of an index, unless it is a covering index, which is not the case in your query. This part of the clause negates any use of an index on thename
column.Well, you can use as many "equals" conditions as you like, where possible, but up to one range condition only, when using an index.
Any index you'll want to put on that table will start with the
condition
column, since it is the one column where you saycondition = 1
, which is an equality check. You are then left to choose any of the two columns on which you place a range condition. Choose the one which will most likely reduce more rows (leaving less rows in the search space).So, you options are:
condition
,qty
)condition
,price
)So that MySQL will first match the
condition
column in the index, then follow up to the next column, whichever it is.MySQL does not read conditions from start to end nor from end to start. It just notes down the various conditions, then tries, when possible, to find an index which satisfies them (or part of them). It then depends on the order of columns in the index -- not on the order by which the columns appear in the query.