Sql-server – Sum currencies excluding duplicatied records

duplicationgroup byjoin;sql serversum

I have a CRMOffer table containing customerID and other CRMOffer data. And a Sell table containing customerID and sellValue and sellPK.

Sell table has a explicit PK, CRMOffer sadly doesn't have one available.

I must join these tables, so that I can relate CRM offers to actual sells.

If Sell table doesn't have a specific customerID, sell value will be null. But if it does, and there are multiple CRM offers to the same customer ID, the join will result in duplicated SellPK and sellValue.

I must then group by CRMOffer's customerID and sum sell values, but sum only once for each SellPK.

A given sell event may be duplicated, in case CRM made multiple offers to a customer that generated a sell. But there may also be multiple distinct sells from the same customer.

Sorry I think the explanation is kinda confusing. Basically I must sum values of distinct sells being careful to exclude duplications resulted to multiple CRM offers. I can see if a sell is duplicated by its SellPK. I can use count(DISTINCT customerID) to remove duplications from customerID, but I can't find a way to remove duplications from sell value.

Best Answer

I agree with oNare that adding a sqlFiddle that builds a simplified version of your data model, shows what you have tried so far, and shows the desired results would be most helpful.

That said, you might want something like the following in order to avoid double-counting the sales for any customers that were given multiple offers. I've put inline comments quoting the relevant parts of your question in this example query:

SELECT s.customerId, s.totalSellValue, o.* -- "other CRMOffer data"
FROM CRMOffer o
JOIN (
    -- "I must then group by CRMOffers customerID and sum sell values, but sum only once for each SellPK"
    -- Compute total sell value for each customer before joining to CRMOffer in order to avoid double-counting
    -- NOTE: Grouping by Sell.customerId is the same as grouping by CRMOffers.customerId since the two must match
    SELECT customerId, SUM(sellValue) AS totalSellValue
    FROM Sell
    WHERE customerId IS NOT NULL -- "If Sell table doesnt have a specific customerID, sell value will be null"
    GROUP BY customerId
) s
    ON s.customerId = o.customerId