LEFT JOIN conversion to INNER JOIN, can’t change FROM clause

join;oracletimestamp

I have 2 tables:
BASE is master and is linked with CHILD via field ID: Left outer Join / 1:N relation.
These 2 tables contain historised data, defined by 2 date-fields: start_date and end_date.
However the history in BASE is independent from the history in CHILD. E.g.:

BASE
1   1/1/2000    1/1/2010
1   1/1/2010    31/12/2012

CHILD
1   1/1/2000    31/12/2005
1   31/12/2005  1/1/2010

The goal:
For a given date, select the active record in both table and show them in 1 result-record.

Problem:
If there is no match in CHILD (e.g. 1/1/2011), I get “no rows” whereas it should show the data from BASE (and Null-values for the CHILD-fields)

This is my query:

SELECT
  BASE.*, CHILD.*
FROM
  BASE LEFT OUTER JOIN CHILD ON (BASE.ID=CHILD.ID)
WHERE
  BASE.ID  = '1'
  AND BASE.START_DATE  <=  TO_DATE('1/1/2011', 'DD/MM/YYYY')
  AND BASE.END_DATE  >  TO_DATE('1/1/2011', 'DD/MM/YYYY')
  AND CHILD.START_DATE  <=  TO_DATE('1/1/2011', 'DD/MM/YYYY')
  AND CHILD.END_DATE  >  TO_DATE('1/1/2011', 'DD/MM/YYYY')

I know that my LEFT JOIN will be converted to an INNER JOIN because of CHILD – columns in the WHERE clause. I also tried the IS NULL-trick that I found on this forum, but I’m still struggling with it.

Important remark:
Because of the technical setup, it is not possible to change the FROM clause, so I’m forced to find a solution via the WHERE clause …

Can you help?

Best Answer

That one caused me quite a bit of headache before I found a halfway elegant solution:

WITH x AS (
    SELECT BASE.id AS b_id
          ,BASE.start_date AS b_start_date
          ,BASE.end_date AS b_end_date
          ,CHILD.id AS c_id
          ,CHILD.start_date AS c_start_date
          ,CHILD.end_date AS c_end_date
          ,CASE WHEN DATE '2011-1-1' BETWEEN CHILD.start_date AND CHILD.end_date
                                     OR CHILD.id IS NULL THEN 0 ELSE 1 END AS c_hit
    FROM   BASE LEFT OUTER JOIN CHILD ON (BASE.ID=CHILD.ID)
    WHERE  BASE.ID = 1
    AND    DATE '2011-1-1' BETWEEN BASE.start_date AND BASE.end_date
    )
SELECT b_id, b_start_date, b_end_date
      ,c_id, c_start_date, c_end_date
FROM   x
WHERE  x.c_hit = 0

UNION ALL
SELECT b_id, b_start_date, b_end_date
      ,NULL, NULL,         NULL
FROM   x
GROUP  BY b_id, b_start_date, b_end_date
HAVING min(c_hit) = 1;

Of course, the better solution would be to get rid of the restrictions and adjust the JOIN in the base query. As @a_horse expressed it so poetically in the comments above:

the perils of ORM frameworks

Anyway, to solve the question at hand:
After selecting rows where BASE.id and BASE.start_date / BASE.end_date match, we have four different cases:

  1. No child at all, so CHILD.id is NULL.
  2. Child matches.
  3. Child does not match and no other matching child.
  4. Child does not match but other matching child.

This is hard to solve on one query level. You could do it with window functions and CASE statements and DISTINCT, in fact I had a basic working query, but it gets monstrous.

Using a CTE dramatically simplifies the syntax.
In the CTE:

  • Filter BASE on id and date range. Still includes rows with non-matching children.
  • Assign aliases for identical column names
  • Tag rows where child matches the date

In the final SELECT:

  • Return all rows with matching children
  • Return one row where no matching children exist, with NULLs for child-columns.

Here is working demo on sqlfiddle.com for those who want to toy with it.