Mysql – LEFT JOIN performance for WHERE on secondary table

MySQL

I've been developing for years and through out the carrier I've been using only LEFT JOIN as it provides a more logical sense of mapping my one to many data.

However recently I ran across a performance thought:

Say I have table user with millions of rows and table gender with only 2 rows

user                                gender
=====================               ================
id name      genderId               id gender show
=====================               ================
1  Sam       1                      1  Male   0
2  Samantha  2                      2  Female 1
3  Another millions of records

If I'm doing

select * from user u left join gender g on u.genderId = g.id where g.show = 1;

Does SQL engine joins the whole million user records first, then scan the million rows of records for join projected value?

Or is the engine smart enough to query and join only the rows with correct gender? Or is RIGHT JOIN the correct method to provide best performance?

If the behavior is dependent on engine type, let's talk about MySQL InnoDB.

Best Answer

Remember how SQL works: construct a set of tuples (rows) based on the given tables and joins, then filter those based on any where clause.
Because you have a where condition on the "right" table, your outer join is effectively converted into an inner join.

To retain the "outer-ness" of the join, put the filter condition into the join construct, rather than in the where clause:

select * 
from      user u 
left join gender g 
     on  u.genderId = g.id 
     and g.show = 1 
[where ...]