MySQL: LEFT OUTER JOIN within reason

join;MySQL

What I'm attempting to do is LEFT OUTER JOIN a table, but I only want the last inserted record of the joined table to affect the results. Currently each table row within the joined table seems to be affecting the results of my query.

My current LEFT OUTER JOIN is similar to:

LEFT OUTER JOIN tablethree AS t3 ON t3.sid = t2.sid

Inside of tablethree I have two different results. Both of them are appearing to affect my results, but what I would like is for the most recently entered result to affect my end results.

My full query looks similar to (only using * here to save time):

SELECT t1.*
FROM tableone AS t1
INNER JOIN tabletwo AS t2
  ON t1.cid = t2.id
LEFT OUTER JOIN tablethree AS t3
  ON t3.sid = t2.sid
WHERE t1.fieldone = 1 
  && t1.odate NOT BETWEEN t3.startdate AND t3.enddate

The tablethree start date and end date that I want reflect 03-01-2014 and 03-31-2014, but the second result in that table is 04-01-2014 and 04-30-2014 and this appears to be affecting the results of the query.

My Question is: How can I change the LEFT OUTER JOIN to only use the most recently entered result and not each result found to affect my final outcome of results?

Best Answer

First consider a query that computes which rows are actually relevant from tablethree. With the assumption that with "most recently entered result" you mean "most recent enddate" the following query would gather the appropriate rows:

SELECT sid, MAX(enddate) FROM `tablethree` GROUP BY sid

Now you can build a join to retrieve not only sid, but all of the data of tablethree:

SELECT a.*
FROM tablethree a
INNER JOIN (
  SELECT sid, MAX(enddate) FROM `tablethree` GROUP BY sid
) b
ON a.sid = b.sid AND a.enddate = b.enddate

This is the result set you actually want to "left join in". You have to insert this into your original query:

SELECT t1.*
FROM tableone AS t1
INNER JOIN tabletwo AS t2
  ON t1.cid = t2.id
LEFT OUTER JOIN (
  SELECT a.*
  FROM tablethree a
  INNER JOIN (
    SELECT sid, MAX(enddate) FROM `tablethree` GROUP BY sid
  ) b
  ON a.sid = b.sid AND a.enddate = b.enddate
) AS t3
  ON t3.sid = t2.sid
WHERE t1.fieldone = 1 
  AND t1.odate NOT BETWEEN t3.startdate AND t3.enddate

What should also work is the following:

SELECT t1.*
FROM tableone AS t1
INNER JOIN tabletwo AS t2
  ON t1.cid = t2.id
LEFT OUTER JOIN tablethree AS t3
  ON t3.sid = t2.sid
LEFT OUTER JOIN (
  SELECT sid, MAX(enddate) FROM `tablethree` GROUP BY sid
) mostrecent
  ON t3.sid = mostrecent.sid AND t3.enddate = mostrecent.enddate

WHERE t1.fieldone = 1 
  AND t1.odate NOT BETWEEN t3.startdate AND t3.enddate
  AND mostrecent.enddate IS NULL

This includes both tablethree and the new SELECT as left joins, and sorts out the rows where mostrecent.enddate IS NULL (meaning those rows which are actually not most recent). This should lead to the same result, but MySQL may be able to compute this result a little faster. EXPLAIN on both queries should reveal possible differences in computation.