MySQL: Does the order of the ON condition of a LEFT JOIN matter

join;MySQL

I have two tables, employees and unionMembers. The people in unionMembers are a small subset of the people in employees. I want to do a SELECT query using a LEFT JOIN like so:

SELECT    wage, unionDues
FROM      employees 
LEFT JOIN unionMembers ON employees.ID = unionMembers.ID 
WHERE     employees.ID = 'ID001';

Does it matter if I do ON employees.ID = unionMembers.ID or reverse the order and do ON unionMembers.ID = employees.ID?

Let's say I have 100 employees and 5 of them are union members. When doing the LEFT JOIN, ideally it would have to do 5 comparisons instead of a 100. Would the order of the equality test make any difference?

Best Answer

SQL is a declarative language; that is, it does not specify how something should be done, only what the result should be.

The computing languages I have used all respect the properties of equality, so employees.ID = unionMembers.ID is equivalent to unionMembers.ID = employees.ID

In this case, rather than an outer join, which can generate a large data-set, you would probably want to use a sub-query or the EXISTS operator, depending on your data structure. Something like:

SELECT ID, wage, unionDues FROM employees WHERE ID in ( SELECT ID FROM unionMembers WHERE ID = 'ID001' );

There is an in-depth discussion here: https://dev.mysql.com/doc/refman/8.0/en/subquery-optimization-with-exists.html

HTH

Related Solutions

Mysql – Left join not working like expected

If you run the statement without the where clause you'll see why:

lang | id    | lang | id   
-----+-------+------+------
en   | hello | en   | hello
en   | hello | fr   | hello
fr   | hello | en   | hello
fr   | hello | fr   | hello
en   | world | en   | world
en   | world | de   | world
de   | world | en   | world
de   | world | de   | world

The join on the "id" column works like this:

Take the first hello from the table and look for all rows that contain hello - that yield two rows for the first hello. The same happens with the second hello, so you wind up with 2x2 rows for the join on hello. And the same for world

The outer join does not play any role, because there is a match for each id (actually: two matches).

You can never get your first (intended) result because that implies that all rows in the "a" table have lang='en' (which is of course not true).

To get the missing translations you need to first create the combination of all languages and ids:

select distinct a.id, b.lang
from trans a
  cross join trans b;

Now you need to find all rows that are not in that result:

select *
from ( 
  select distinct a.id, b.lang
  from trans a
    cross join trans b
) ac
where not exists (select 1
                  from trans mt
                  where mt.id = ac.id 
                    and mt.lang = ac.lang);

You can achieve this with an outer join as well. I simply prefer the not exists because it documents more clearly the intention (and because I hardly ever work with MySQL which is known to perform poorly with sub-queries like that)

select ac.*
from ( 
  select distinct a.id, b.lang
  from trans a
    cross join trans b
) ac
  left join trans mt on mt.id = ac.id and mt.lang = ac.lang
where mt.id is null;

Here is an SQLFiddle: http://sqlfiddle.com/#!2/9804d/6

Edit

after testing the peformance with larger tables, it seems that Sean's version of the cross join is much more efficient than mine.

So this statement should be faster than the ones above:

select at.*
from (
    select lang_code, label_code
    from (
        SELECT distinct lang_code 
        FROM translations
    ) as translang
      cross join (
         SELECT distinct label_code 
         FROM translations
      ) as transid
) at
  left join translations mt 
         on mt.lang_code = at.lang_code
        and mt.label_code = at.label_code
where mt.lang_code is null;

Edit 2

And another version to be tested (SQL-Fiddle):

SELECT a.lang, a.id,
       l.lang AS blang
FROM trans a 
  CROSS JOIN 
    ( SELECT DISTINCT lang
      FROM trans
    ) l
  LEFT JOIN trans b 
    ON  b.id = a.id
    AND b.lang = l.lang 
WHERE a.lang = 'en'
  AND b.id IS NULL ;

Mysql – Does using LIMIT improve the performance and is it noticeable

If you want to take advantage of LIMIT to improve performance, you need

understand the data you are retrieving
proper indexing the correct sequence of columns
take responsibility for refactoring the query
using LIMIT before JOIN

These principles can go a long way if you can orchestrate them.

I learned these concepts by watching this YouTube Video (listen carefully through the French accent)

I used those concepts to answer a very tough StackOverflow question about getting the top 40 articles from some tables : May 12, 2011 : Fetching a Single Row from Join Table.

In my answer to that question (May 16, 2011), I wrote the following query and tested it thoroughly:

SELECT
  AAA.author_id,
  AAA.date_created,
  IFNULL(BBB.title,'<NO_TITLE>') title,
  IFNULL(CCC.filename,'<NO-IMAGE>') filename,
  IFNULL(CCC.date_added,'<NO-IMAGE-DATE>') image_date
FROM
(
  SELECT
    AA.id,
    AA.date_added,
    BB.author_id,
    BB.date_created
  FROM
  (
    SELECT
      A.id,IFNULL(MAX(B.date_added),'1900-01-01 00:00:00') date_added
      FROM (SELECT id FROM articles ORDER BY date_created DESC LIMIT 40) A
      LEFT JOIN article_images B ON A.id = B.article_id
      GROUP BY A.id
  ) AA
  INNER JOIN articles BB USING (id)
) AAA
LEFT JOIN article_contents BBB ON AAA.id=BBB.article_id
LEFT JOIN article_images CCC
ON (AAA.id=CCC.article_id AND AAA.date_added=CCC.date_added)
ORDER BY AAA.date_created DESC;

Please notice the line in the query with the LIMIT

      FROM (SELECT id FROM articles ORDER BY date_created DESC LIMIT 40) A

This subquery is buried three levels deep. This allowed me to get the last 40 articles using LIMIT. Then, I performed the necessary JOINs afterwards.

LESSONS LEARNED

Doing LIMIT inside subqueries may not always be the answer because of the cardinality of indexes, the data content, and the result set size from the LIMIT. If you have all your "ducks in a row" (Have the four principles in mind for your query), you can get surprisingly good results.
Make your queries as simplistic as possible when doing LIMIT by gathering keys only.

Best Answer

Related Solutions

Mysql – Left join not working like expected

Mysql – Does using LIMIT improve the performance and is it noticeable

LESSONS LEARNED

Related Question