Mysql – Trying to return a complete column from one table, and a count from another table

join;MySQL

I currently have two tables that I am trying to pull data from, one being Customers and one being Bills.

Customers is a static table that just lists all the customers, but the reason I need this table is because it provides the actual human-friendly customer name instead of the less helpful numeric customer ID.

Bills is a frequently updated table and I want to count the number of updates per day for each customer. So I came up with the following:

SELECT c.name, COUNT(DISTINCT b.billid) from customers c
JOIN bills b 
ON b.customerid = c.customerid
WHERE b.createdate = CURDATE() -INTERVAL 1 DAY
GROUP BY b.customerid;

This works wonderfully with one problem: There are some clients that don't process on a given day. And while this is currently a small select overall (I get about 24 out of a max of 31 rows back) this data is going to be used in metrics reports and transposing the data with these missing rows could eventually become a problem.

What I want is to be able to get EVERY name from Customers showing up in the first column, and then if there is no data from Bills for that customer that day, I'd like it to show up as 0 or NULL. I've been banging my head against the wall for a while now, and I'm not sure I can actually do this or not in MySQL.

Best Answer

I think I was able to accomplish what you want using the following query:

SELECT c.name, coalesce(billcount, 0) FROM customers c
LEFT JOIN (
    SELECT c.name, c.customerid, COUNT(DISTINCT b.billid) as billcount from customers c
    JOIN bills b 
    ON b.customerid = c.customerid
    WHERE b.createdate = CURDATE() -INTERVAL 1 DAY
    GROUP BY b.customerid
) bc
on bc.customerid = c.customerid

Essentially I just put your original query in a subquery (the indented area between the parenthesis), and left joined the columns you wanted with the results from that subquery.

This will always return a full list of customers and the number of bills they have, if any. Right now it's returning 0 for the customers that don't have any bills in the given time range. If you prefer it to return null, omit the coalesce() function and just use billcount in the column list on the first line. (Obviously you can rename this column to whatever you want.)