Mysql – Outer join on a view with calculated columns performs calculation for missing (“outer”) records

coalescejoin;MySQLmysql-5.5view

I have a view in MySQL, which has some calculated fields — COALESCE and the like. If I do a left outer join on that view (so records from that view may be missing), then the calculated fields are still calculated for the missing records, instead of showing up as NULL.

Example

I've tested the following example on MySQL versions 5.5.40-36.1-log and 5.5.53-38.5 (because that's what I've got on hand) as well as on SQL Fiddle versions 5.5 and 5.6.

Setup

-- A simple view
CREATE OR REPLACE VIEW foo
AS
      SELECT 1 AS x
UNION SELECT 2
UNION SELECT 3;

-- Another simple view
CREATE OR REPLACE VIEW bar
AS
      SELECT 1 AS y
UNION SELECT 2
UNION SELECT 3;

-- A (contrived) view with a calculated column
CREATE OR REPLACE VIEW baz
AS
SELECT
   f.x,
   b.y,
   SIGN(COALESCE(b.y, 0)) AS z
FROM foo f
LEFT JOIN bar b ON f.x = 2 * b.y;

Query

-- A query that does a left outer join
-- on the view with the calculated column
SELECT *
FROM foo f
LEFT JOIN baz b ON f.x = b.y;

Expected result

I would expect the following result, where we see one row with records from baz and two rows without.

| x | x | y | z |
-----------------
| 1 | 2 | 1 | 1 |
| 2 | - | - | - |
| 3 | - | - | - |

Actual result

But instead I get the following, with indeed one row with records from baz, but two rows where there are no records from baz except for the calculated column, which is calculated for all three rows!

| x | x | y | z |
-----------------
| 1 | 2 | 1 | 1 |
| 2 | - | - | 0 |
| 3 | - | - | 0 |

So z is calculated, even where baz returns no records (as evidenced by the NULLs for x and y).

My actual case involves a join between the view and a table the view is based on. I've replicated this in the example, using views only. I've also tested it without the "self-join"; this makes no difference.

Workaround

What does work for me, is hiding baz in a sub-select, like so.

SELECT *
FROM foo f
LEFT JOIN (
   SELECT *
   FROM baz
) b ON f.x = b.y;

This yields the expected result:

| x | x | y | z |
-----------------
| 1 | 2 | 1 | 1 |
| 2 | - | - | - |
| 3 | - | - | - |

Is this expected behaviour? Why?


‡: worked for me yesterday, can't get it to work today.

Best Answer

If the two queries:

SELECT * 
FROM foo f 
  LEFT JOIN baz b 
  ON f.x = b.y; 

and

SELECT * 
FROM foo f 
  LEFT JOIN (SELECT * FROM baz) b 
  ON f.x = b.y; 

return different results, then it's a bug and the table and view definitions are irrelevant. That should never happen.

Before you go and post a bug report, check your view and queries in the version (5.5.54). Your 5.5.40 is 14 minor versions and about 2.5 years older than the latest 5.5 version.

If the bug still persists, then file a bug report to MySQL with this example. A similar bug has already been reported: Return inconsistent result when using inline SELECT vs VIEW for left join, so if you report I suggest you add your examples there.

MySQL developers have comfirmed that one and suggested a few workarounds (until it is fixed). You can try the suggestion to create the views with ALGORITHM = TEMPTABLE:

There are a few workarounds:

In all version, the query using the view will give correct result if view is created with "CREATE ALGORITHM=TEMPTABLE VIEW ..."

For 5.7/8.0, both queries will give the correct result by setting optimizer_switch='derived_merge=off'

In 8.0 the NO_MERGE hint can be used to get correct behavior in both cases: SELECT /*+ NO_MERGE(t2) */ ...


Having said that, the query is more complex than needed. You could have used something simpler (although I don't like having a result with columns identically named):

CREATE VIEW baz2
AS
SELECT
   f2.x AS x
   f1.x AS x,
   CASE WHEN f1.x IS NOT NULL THEN f2.x END AS y,
   SIGN(f1.x) AS z
FROM foo f2                   -- changed:   f2 LEFT JOIN f1
  LEFT JOIN bar f1
  ON f1.x = 2 * f2.x;

and

SELECT * FROM baz2 ;