MySQL Join Fails – How to Return Zero Results

join;MySQL

My query is trying to find monthly records with missing days, ie if month has 31 days but less than 31 records I want to detect it. This works fine unless all 31 records are missing when nothing is returned! (By the way Daily is a big table). So how do I get the transmitters with no records at all in a particular month to return zero?

  SELECT SUM(w.amount) AS MonthlyTotal,COUNT(*) AS Days,DAY(LAST_DAY('2016-06-01')) AS DaysInMonth 
    FROM Daily  w,Transmitter t 
    WHERE t.SiteID='1907' AND t.UtilityID='1' AND w.TransmitterID=t.TransmitterID  AND NOT t.deleted
    AND w.TimeStamp  BETWEEN '2016-06-01' AND LAST_DAY('2016-06-01') GROUP BY t.TransmitterID;

Best Answer

Before going into the main problem, there are a few more issues with the query:

  • the use of string literals ('1907' '1') for values that are compared with columns that seem to have integer type (SiteID, UtilityID). If the columns are integers, use integers, not strings:

    WHERE ... t.SiteID = 1907 AND t.UtilityID = 1
    
  • using BETWEEN and LAST_DAY() for datetime/timestamp comparisons. This - assuming that the w.Timestamp column is indeed a TIMESTAMP will give you incorrect results, unless all your timestamps have 00:00:00 time part. The LAST_DAY('2016-01-01') will be '2016-01-31 00:00:00' and you lose the whole last day of the month, except the very first second.
    If you change that to BETWEEN '2016-06-01' AND '2016-07-01 is somewhat better but still wrong as you you'll get a few results from the next day (in July)!
    One way that works correctly and with all datatypes (DATE, DATETIME, TIMESTAMP) is to use inclusive-exclusive range checks, with >= and <:

    WHERE ... w.TimeStamp >= '2016-06-01' AND w.TimeStamp < '2016-07-01'
    

    If the t.Timestamp column is of DATE type, then ok, BETWEEN can be used although I prefer the consistency of the above suggested code.

  • using old ANSI syntax without JOIN. This is not an error but it's harder for debugging when there are many tables and error-prone (we might forget a joining clause):

    FROM Daily w, Transmitter t 
    WHERE ... AND w.TransmitterID = t.TransmitterID AND ...
    

    It's better (in my opinion) to use the new (since 1992 ;) explicit JOIN syntax. It's also easier to change an INNER join to a LEFT or RIGHT outer join:

    FROM Daily w JOIN Transmitter t
         ON w.TransmitterID = t.TransmitterID  
    WHERE ... AND w.TransmitterID = t.TransmitterID AND ...
    

Now for the main problem, the solution is to start from the Trasmitter table and then LEFT JOIN the details (Daily) table:

SELECT 
    t.TransmitterID,
    SUM(w.amount)                        AS MonthlyTotal,
    COUNT(w.TransmitterID)               AS Days,
    DATEDIFF('2016-07-01', '2016-06-01') AS DaysInMonth 
FROM Transmitter t LEFT JOIN Daily w 
               ON  w.TransmitterID = t.TransmitterID  
               AND w.TimeStamp >= '2016-06-01' 
               AND w.TimeStamp  < '2016-07-01'
WHERE t.SiteID = 1907 
  AND t.UtilityID = 1 
  AND NOT t.deleted
GROUP BY t.TransmitterID ;