Sql-server – Sum multiple columns, based on distinct values in different Columns

Tags: group by, sql server, sum

I have the following table in an Azure SQL DB. It contains duplicate values that I'm trying to sum.

Here is the logic:
If the paymentid is unique, sum payment.
If the creditid is unique, sum credit.
If the debitid is unique, sum debit.
Take the max(source).
The idea is to get a single row per id, with each distinct payment, credit and debit counted only once in the sums.

id	payment	credit	debit	Source	paymentid	creditid	debitid
1510142123	-589.53	0	0	CC	5831879	NULL	NULL
1510142123	-589.53	0	0	CC	5831882	NULL	NULL
1510142123	-155.06	0	0	CC	5898896	NULL	NULL
157771145	-126.42	0	0	CC	5885900	NULL	NULL
157771145	-58.73	0	0	CC	5885903	NULL	NULL
158088837	-55.14	0	-3.45	CC	5897306	NULL	5897303
158088837	-5.75	0	-3.45	CC	5897309	NULL	5897303
158464166	-161	0	-3.45	CC	5910551	NULL	5910548
158464166	-24.15	0	-3.45	CC	5910554	NULL	5910548
1591970734	-111.61	0	0	Bank	5939648	NULL	NULL
1591970734	-0.01	0	0	Cash	5939711	NULL	NULL
1591970734	-0.01	0	0	Cash	5939714	NULL	NULL
159297565	-708.93	20	0	CC	5943728	5910848	NULL
159297565	-0.02	20	0	Cash	5948207	5910848	NULL

For example:

158464166 | -185.15 | 0 | -3.45 | CC | 5910551 | 5910548

(In the above, I've taken the min(paymentid) to make it look nicer.)

Please note that although creditid and debitid only ever repeat a single value per id in the sample above, an id may have several distinct creditid or debitid values, so any code has to handle that. paymentid is always unique.

It is also possible that the values for payment, credit and debit may not be unique (e.g. a payment of $50 is made twice to a single ID), so we can't group on payment.

I got as far as this:

SELECT id, sum(payment), sum(credit), sum(debit), max(source), creditid, debitid
FROM (
  SELECT *,
         COUNT(*) OVER (PARTITION BY id) AS cnt
  FROM Temp_Payment) AS t
WHERE t.cnt > 1
GROUP BY id, creditid, debitid

but it's not giving the expected outcome.
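
To make the problem with this attempt concrete, here is a minimal sketch that runs the same query against the two sample rows for id 158464166. It uses Python's built-in sqlite3 module as a stand-in for Azure SQL, which is an assumption made only to keep the example self-contained; the table name Temp_Payment and the column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Temp_Payment (id INTEGER, payment REAL, credit REAL, debit REAL,"
    " source TEXT, paymentid INTEGER, creditid INTEGER, debitid INTEGER)"
)
# Two sample rows for id 158464166: two payments that share one debit (5910548).
conn.executemany(
    "INSERT INTO Temp_Payment VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (158464166, -161.00, 0, -3.45, "CC", 5910551, None, 5910548),
        (158464166, -24.15, 0, -3.45, "CC", 5910554, None, 5910548),
    ],
)

# The attempt from the question, unchanged apart from layout.
attempt = """
SELECT id, SUM(payment), SUM(credit), SUM(debit), MAX(source), creditid, debitid
FROM (
    SELECT *, COUNT(*) OVER (PARTITION BY id) AS cnt
    FROM Temp_Payment
) AS t
WHERE t.cnt > 1
GROUP BY id, creditid, debitid
"""
row = conn.execute(attempt).fetchone()
total_payment = round(row[1], 2)
total_debit = round(row[3], 2)
print(total_payment, total_debit)
```

The payment total comes out as the expected -185.15, but the debit total is -6.90 rather than -3.45: the shared debit 5910548 appears on both payment rows, so a plain SUM counts it twice.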

Best Answer

For the one row you posted expected results for, this will return what you want.

In the future, please post your table as an insert script so that it's easier to work with.

SELECT
    x.*
INTO #x
FROM
(
    VALUES
        (1510142123,-589.53,0,0,'CC',5831879,NULL,NULL),
        (1510142123,-589.53,0,0,'CC',5831882,NULL,NULL),
        (1510142123,-155.06,0,0,'CC',5898896,NULL,NULL),
        (157771145,-126.42,0,0,'CC',5885900,NULL,NULL),
        (157771145,-58.73,0,0,'CC',5885903,NULL,NULL),
        (158088837,-55.14,0,-3.45,'CC',5897306,NULL,5897303),
        (158088837,-5.75,0,-3.45,'CC',5897309,NULL,5897303),
        (158464166,-161,0,-3.45,'CC',5910551,NULL,5910548),
        (158464166,-24.15,0,-3.45,'CC',5910554,NULL,5910548),
        (1591970734,-111.61,0,0,'Bank',5939648,NULL,NULL),
        (1591970734,-0.01,0,0,'Cash',5939711,NULL,NULL),
        (1591970734,-0.01,0,0,'Cash',5939714,NULL,NULL),
        (159297565,-708.93,20,0,'CC',5943728,5910848,NULL),
        (159297565,-0.02,20,0,'Cash',5948207,5910848,NULL)
) AS x (id, payment, credit, debit, [source], paymentid, creditid, debitid);

SELECT 
    x.id,
    SUM(DISTINCT y.payment) AS payment, 
    SUM(DISTINCT y.credit) AS credit, 
    SUM(DISTINCT y.debit) AS debit,
    MAX(x.source) AS source,
    MIN(x.paymentid) AS min_paymentid,
    MAX(x.debitid) AS max_debitid
FROM #x AS x
CROSS APPLY
(
    SELECT
        SUM(x2.payment) AS payment,
        SUM(x2.credit) AS credit,
        SUM(x2.debit) AS debit
    FROM #x AS x2
    WHERE x.id = x2.id
    AND   x.paymentid = x2.paymentid
    GROUP BY x2.id, 
             x2.creditid, 
             x2.debitid
) AS y
GROUP BY x.id
ORDER BY x.id;
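
Note that SUM(DISTINCT ...) removes duplicates by value, so two separate payments of the same amount for one id are counted once, which is the case the question says must be handled. One possible alternative, sketched below, removes duplicates by ID instead: every payment row counts (paymentid is unique), while credits and debits are first collapsed to one row per creditid or debitid. The sketch runs through Python's built-in sqlite3 module purely so it is self-contained, and the SQL sticks to constructs SQL Server also supports. It assumes that rows with a NULL creditid or debitid carry no amount (true of the sample data) and that a given creditid or debitid always carries the same amount; the table name Temp_Payment is borrowed from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Temp_Payment (id INTEGER, payment REAL, credit REAL, debit REAL,"
    " source TEXT, paymentid INTEGER, creditid INTEGER, debitid INTEGER)"
)
# A subset of the sample rows from the question.
conn.executemany(
    "INSERT INTO Temp_Payment VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (1510142123, -589.53, 0, 0, "CC", 5831879, None, None),
        (1510142123, -589.53, 0, 0, "CC", 5831882, None, None),
        (1510142123, -155.06, 0, 0, "CC", 5898896, None, None),
        (158464166, -161.00, 0, -3.45, "CC", 5910551, None, 5910548),
        (158464166, -24.15, 0, -3.45, "CC", 5910554, None, 5910548),
        (159297565, -708.93, 20, 0, "CC", 5943728, 5910848, None),
        (159297565, -0.02, 20, 0, "Cash", 5948207, 5910848, None),
    ],
)

by_id_query = """
WITH payments AS (
    -- paymentid is unique, so every payment row counts.
    SELECT id, SUM(payment) AS payment, MAX(source) AS source,
           MIN(paymentid) AS min_paymentid
    FROM Temp_Payment
    GROUP BY id
),
credits AS (
    -- Collapse repeated creditid rows to a single row before summing.
    SELECT id, SUM(credit) AS credit
    FROM (SELECT DISTINCT id, creditid, credit
          FROM Temp_Payment WHERE creditid IS NOT NULL) AS dc
    GROUP BY id
),
debits AS (
    -- Same idea for debits.
    SELECT id, SUM(debit) AS debit, MAX(debitid) AS max_debitid
    FROM (SELECT DISTINCT id, debitid, debit
          FROM Temp_Payment WHERE debitid IS NOT NULL) AS dd
    GROUP BY id
)
SELECT p.id, p.payment, COALESCE(c.credit, 0), COALESCE(d.debit, 0),
       p.source, p.min_paymentid, d.max_debitid
FROM payments AS p
LEFT JOIN credits AS c ON c.id = p.id
LEFT JOIN debits AS d ON d.id = p.id
"""
# Map each id to its (payment, credit, debit) totals, rounded to cents.
totals = {
    row[0]: tuple(round(v, 2) for v in row[1:4])
    for row in conn.execute(by_id_query)
}

# For comparison: the value-based total for the id with a repeated amount.
value_based = conn.execute(
    "SELECT SUM(DISTINCT payment) FROM Temp_Payment WHERE id = 1510142123"
).fetchone()[0]

print(totals)
print(round(value_based, 2))
```

On this subset the ID-based version gives -1334.12 for id 1510142123, while SUM(DISTINCT payment) gives -744.59 because the two -589.53 payments collapse into one. The shared debit for 158464166 and the shared credit for 159297565 are each counted once.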

Related Solutions

Sql-server – Query to return fields of distinct values per key

Without knowing anything about the source data, perhaps this would do what you want?

USE Test;
GO
CREATE TABLE GENDER
(
    ORG INT NOT NULL
    , GENDER VARCHAR(1) NOT NULL
);

CREATE TABLE AGE
(
    ORG INT NOT NULL
    , AGE TINYINT
);

CREATE TABLE STATES
(
    ORG INT NOT NULL
    , STATENAME VARCHAR(255)
);

INSERT INTO Gender (ORG, GENDER) VALUES (1, 'M');
INSERT INTO Gender (ORG, GENDER) VALUES (1, 'F');
INSERT INTO Gender (ORG, GENDER) VALUES (2, 'M');
INSERT INTO Gender (ORG, GENDER) VALUES (2, 'F');
INSERT INTO Gender (ORG, GENDER) VALUES (3, 'M');
INSERT INTO Gender (ORG, GENDER) VALUES (3, 'F');

INSERT INTO AGE (ORG, AGE) VALUES (1,27);
INSERT INTO AGE (ORG, AGE) VALUES (1,28);
INSERT INTO AGE (ORG, AGE) VALUES (1,29);
INSERT INTO AGE (ORG, AGE) VALUES (1,30);
INSERT INTO AGE (ORG, AGE) VALUES (2,37);
INSERT INTO AGE (ORG, AGE) VALUES (2,38);
INSERT INTO AGE (ORG, AGE) VALUES (2,39);
INSERT INTO AGE (ORG, AGE) VALUES (2,40);
INSERT INTO AGE (ORG, AGE) VALUES (3, 2);

INSERT INTO STATES (ORG, STATENAME) VALUES (1,'FL');
INSERT INTO STATES (ORG, STATENAME) VALUES (1,'GA');
INSERT INTO STATES (ORG, STATENAME) VALUES (1,'MN');
INSERT INTO STATES (ORG, STATENAME) VALUES (1,'NM');
INSERT INTO STATES (ORG, STATENAME) VALUES (2,'FL');
INSERT INTO STATES (ORG, STATENAME) VALUES (2,'MN');
INSERT INTO STATES (ORG, STATENAME) VALUES (2,'NM');
INSERT INTO STATES (ORG, STATENAME) VALUES (3,'FL');
INSERT INTO STATES (ORG, STATENAME) VALUES (3,'GA');
INSERT INTO STATES (ORG, STATENAME) VALUES (3,'NM');

CREATE TABLE FACTS
(
    ORG INT NOT NULL
    , GENDER VARCHAR(1) NULL
    , AGE INT NULL
    , STATENAME VARCHAR(255) NULL
);

INSERT INTO FACTS (ORG, GENDER, AGE, STATENAME)
SELECT ORG, GENDER, NULL, NULL
FROM GENDER
GROUP BY ORG, GENDER
UNION ALL
SELECT ORG, NULL, AGE, NULL
FROM AGE
GROUP BY ORG, AGE
UNION ALL
SELECT ORG, NULL, NULL, STATENAME
FROM STATES
GROUP BY ORG, STATENAME;

SELECT *
FROM FACTS
ORDER BY ORG;

The results:

[image: query results]

This will create a FACTS table that has all the data from several source tables. As @ypercube and @jon-seigel said, this really doesn't make much sense; perhaps we are missing something compelling about your setup.

If this is not what you were expecting, please provide the source tables, and any other pertinent details.
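
To show the shape of the rows that this UNION ALL pattern produces, here is a small sketch restricted to ORG 3 from the inserts above. It runs through Python's built-in sqlite3 module rather than SQL Server, which is an assumption made only so the example can run on its own; each branch of the UNION ALL is one de-duplicating SELECT per source table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE GENDER (ORG INT NOT NULL, GENDER VARCHAR(1) NOT NULL);
CREATE TABLE AGE (ORG INT NOT NULL, AGE TINYINT);
CREATE TABLE STATES (ORG INT NOT NULL, STATENAME VARCHAR(255));
CREATE TABLE FACTS (ORG INT NOT NULL, GENDER VARCHAR(1),
                    AGE INT, STATENAME VARCHAR(255));

-- ORG 3 only, using the values from the inserts above.
INSERT INTO GENDER VALUES (3, 'M'), (3, 'F');
INSERT INTO AGE VALUES (3, 2);
INSERT INTO STATES VALUES (3, 'FL'), (3, 'GA'), (3, 'NM');

-- One branch per source table; the other attribute columns are padded with NULL.
INSERT INTO FACTS (ORG, GENDER, AGE, STATENAME)
SELECT ORG, GENDER, NULL, NULL FROM GENDER GROUP BY ORG, GENDER
UNION ALL
SELECT ORG, NULL, AGE, NULL FROM AGE GROUP BY ORG, AGE
UNION ALL
SELECT ORG, NULL, NULL, STATENAME FROM STATES GROUP BY ORG, STATENAME;
""")

facts = conn.execute("SELECT ORG, GENDER, AGE, STATENAME FROM FACTS").fetchall()
for row in facts:
    print(row)
```

Each source value becomes its own row, so ORG 3 ends up with six rows (two genders, one age, three states), and every row has exactly one of GENDER, AGE and STATENAME filled in.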

Sql-server – Sum currencies excluding duplicated records

I agree with oNare that adding a sqlFiddle that builds a simplified version of your data model, shows what you have tried so far, and shows the desired results would be most helpful.

That said, you might want something like the following in order to avoid double-counting the sales for any customers that were given multiple offers. I've put inline comments quoting the relevant parts of your question in this example query:

SELECT s.customerId, s.totalSellValue, o.* -- "other CRMOffer data"
FROM CRMOffer o
JOIN (
    -- "I must then group by CRMOffers customerID and sum sell values, but sum only once for each SellPK"
    -- Compute total sell value for each customer before joining to CRMOffer in order to avoid double-counting
    -- NOTE: Grouping by Sell.customerId is the same as grouping by CRMOffers.customerId since the two must match
    SELECT customerId, SUM(sellValue) AS totalSellValue
    FROM Sell
    WHERE customerId IS NOT NULL -- "If Sell table doesnt have a specific customerID, sell value will be null"
    GROUP BY customerId
) s
    ON s.customerId = o.customerId
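
The effect of aggregating before the join can be seen in a small sketch. The table and column names CRMOffer, Sell, customerId, sellValue and SellPK come from the query and comments above; the offerId column and all of the data are hypothetical, and Python's built-in sqlite3 module is used only to make the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- offerId is a hypothetical column; the rows below are made-up illustration data.
CREATE TABLE CRMOffer (offerId INTEGER, customerId INTEGER);
CREATE TABLE Sell (sellPK INTEGER, customerId INTEGER, sellValue REAL);

-- Customer 1 received two offers and has two sales; one sale has no customer.
INSERT INTO CRMOffer VALUES (10, 1), (11, 1), (12, 2);
INSERT INTO Sell VALUES (100, 1, 100.0), (101, 1, 50.0),
                        (102, 2, 30.0), (103, NULL, 999.0);
""")

# Joining first and summing afterwards repeats each sale once per offer.
naive = dict(conn.execute("""
    SELECT o.customerId, SUM(s.sellValue)
    FROM CRMOffer o
    JOIN Sell s ON s.customerId = o.customerId
    GROUP BY o.customerId
"""))

# Summing per customer first, then joining, counts each SellPK once.
pre_aggregated = conn.execute("""
    SELECT o.offerId, s.customerId, s.totalSellValue
    FROM CRMOffer o
    JOIN (
        SELECT customerId, SUM(sellValue) AS totalSellValue
        FROM Sell
        WHERE customerId IS NOT NULL
        GROUP BY customerId
    ) s ON s.customerId = o.customerId
    ORDER BY o.offerId
""").fetchall()

print(naive)
print(pre_aggregated)
```

With this data the join-then-sum version reports 300 for customer 1, twice the real total, while the pre-aggregated version attaches the correct total of 150 to each of that customer's two offers.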
