Sql-server – split one column into two columns based on it’s value using t/sql

querysql serversql server 2014t-sql

Hope you're doing well.
I have a table as you can see below :

ACNTHEADER(DDATE DATE ,ACNTCODE INT,VALUE NUMERIC(17,2))

the data in the table :

DDATE       ACNTCODE     VALUE 
6/1/2013      1174       10.00
6/1/2013      1174       20.00
6/1/2013      1174       -3.00
6/5/2013      1174       -5.00
6/7/2013      1174       -1.00
6/8/2013      1174       -1.00
6/8/2013      1174       -1.00
6/8/2013      1174       -5.00
6/10/2013     1174        3.00
6/10/2013     1174       -3.00

I need to have blocks of data for each 'DATE' and the result should be like this :

DATE        ACNTCODE       SUMPOSITIVE   SUMNEGATIVE
6/1/2013      1174            30.00         -3.00
6/2/2013      1174              0              0
6/3/2013      1174              0              0
6/4/2013      1174              0              0
6/5/2013      1174              0           -5.00
6/6/2013      1174              0              0
6/7/2013      1174              0           -1.00
6/8/2013      1174              0           -7.00
6/9/2013      1174              0              0
6/10/2013     1174            +3.00         -3.00

I do not know how to split column value into two columns based on their value and another thing is that the way I'm building blocks of data is that I', joining my tables with date table in our database which is standard date table . I want to know whether there are better ways to do that.
this my query so far:

  SELECT DISTINCT 
         D.MDATE ,
         A.ACNTCODE ,
         sum(CASE WHEN A.VALUE > 0 THEN A.VALUE ELSE 0 END) AS POSITIVEE,
         sum(CASE WHEN A.VALUE < 0 THEN A.VALUE ELSE 0 END) AS NEGATIVE       
  FROM ACNTHEADER A INNER JOIN DIMDATE D
      ON D.MDATE >= A.DDATE AND D.MDATE <= (SELECT MAX(B.DDATE)
                                    FROM ACNTHEADER B
                                    WHERE B.ACNTCODE = A.ACNTCODE)
  group by  D.MDATE ,
      A.ACNTCODE 
  ORDER BY D.MDATE

I have just edited my query. The problem now is that for those dates that are not in the ACNTHEADERtable I need to have zero value but the final result is not correct with my query . How to I handle that?

Best Answer

You are in the right track just a few things to keep in mind.

Make a left join to the date table (DIMDATE).

FROM DIMDATE D LEFT JOIN ACNTHEADER A

Which make sure that no days are omitted.

Since you are trying to add the values based on date, you need an Aggregate function in the final columns.SUM in your case.
To split the columns based on the value you can use the below.

SUM(CASE WHEN VALUE>0 THEN VALUE ELSE 0 END) AS SUMPOSITIVE

Try it and see if face any roadblocks if any Please comment.

ANSWER

IF OBJECT_ID('tempdb..#ACNTHEADER') IS NOT NULL
DROP TABLE #ACNTHEADER

IF OBJECT_ID('tempdb..#DIMDATE') IS NOT NULL
DROP TABLE #DIMDATE


CREATE TABLE #ACNTHEADER
(DDATE DATE,
 ACNTCODE INT,
 VALUE NUMERIC(17,2))


 CREATE TABLE #DIMDATE(
 MDATE DATE
 )


INSERT INTO #ACNTHEADER
SELECT '20130601',      1174,       10.00  UNION ALL
SELECT '20130601',      1174,       20.00  UNION ALL
SELECT '20130601',      1174,       -3.00  UNION ALL
SELECT '20130605',      1174,       -5.00  UNION ALL
SELECT '20130607',      1174,       -1.00  UNION ALL
SELECT '20130608',      1174,       -1.00  UNION ALL
SELECT '20130608',      1174,       -1.00  UNION ALL
SELECT '20130608',      1174,       -5.00  UNION ALL
SELECT '20130610',      1174,        3.00  UNION ALL
SELECT '20130610',      1174 ,      -3.00


INSERT INTO #DIMDATE
select TOP 30 DATEADD (day,ROW_NUMBER() OVER(ORDER BY (SELECT NULL))-1,'20130601')
FROM sys.tables,sys.columns




SELECT MDATE,
       CR.ACNTCODE,
       ISNULL(POSITIVEE,0)POSITIVEE,
       ISNULL(NEGATIVE,0)NEGATIVE    
  FROM #DIMDATE D 
  CROSS JOIN (
              SELECT DISTINCT ACNTCODE
              FROM #ACNTHEADER
              )CR
  LEFT JOIN 
       (SELECT A.ddate ,
               A.ACNTCODE ,
               SUM(CASE WHEN A.VALUE > 0 THEN A.VALUE ELSE 0 END) AS POSITIVEE,
               SUM(CASE WHEN A.VALUE < 0 THEN A.VALUE ELSE 0 END) AS NEGATIVE    
          FROM #ACNTHEADER A 
         GROUP BY A.ddate,
               A.ACNTCODE) AD on ad.DDATE = D.MDATE 
                        AND cr.ACNTCODE = ad.ACNTCODE

I hope your date filter condition is correct, so I am not editing it.

Explanation

Use of LEFT JOIN

This is to make sure that you need all the values for June month for 2013. If you do an INNER JOIN it will only show the values in matching in both table.
Doing Pre-Aggregation before doing a LEFT JOIN

For the first answer, I posted was getting duplicated records. I was in the pre-assumption that the date was joined multiple times and I was wrong.The best way to understand this is by comparing both the result set. Sometimes it hard to visualize so break down your query and see how the JOIN works.

First Answer


    SELECT D.MDATE ,
           A.ACNTCODE ,
           SUM(CASE WHEN A.VALUE > 0 THEN A.VALUE ELSE 0 END) AS POSITIVEE,
           SUM(CASE WHEN A.VALUE < 0 THEN A.VALUE ELSE 0 END) AS NEGATIVE

      FROM #DIMDATE D 
      LEFT JOIN 
           #ACNTHEADER A 
           ON D.MDATE >= A.DDATE AND D.MDATE <= (SELECT MAX(B.DDATE)
                                FROM #ACNTHEADER B
                                WHERE B.ACNTCODE = A.ACNTCODE)
     GROUP BY  D.MDATE,
           A.ACNTCODE 
      ORDER BY D.MDATE

before this will check the one without the filtering clause on this one.


    SELECT D.MDATE ,
           A.ACNTCODE ,
           SUM(CASE WHEN A.VALUE > 0 THEN A.VALUE ELSE 0 END) AS POSITIVEE,
           SUM(CASE WHEN A.VALUE < 0 THEN A.VALUE ELSE 0 END) AS NEGATIVE

      FROM #DIMDATE D 
      LEFT JOIN 
           #ACNTHEADER A 
           ON D.MDATE = A.DDATE 
     GROUP BY  D.MDATE,
           A.ACNTCODE 
      ORDER BY D.MDATE

╔════════════╦══════════╦═══════════╦══════════╗
║   MDATE    ║ ACNTCODE ║ POSITIVEE ║ NEGATIVE ║
╠════════════╬══════════╬═══════════╬══════════╣
║ 2013-06-01 ║ 1174     ║     30.00 ║    -3.00 ║
║ 2013-06-02 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-03 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-04 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-05 ║ 1174     ║      0.00 ║    -5.00 ║
║ 2013-06-06 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-07 ║ 1174     ║      0.00 ║    -1.00 ║
║ 2013-06-08 ║ 1174     ║      0.00 ║    -7.00 ║
║ 2013-06-09 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-10 ║ 1174     ║      3.00 ║    -3.00 ║
║ 2013-06-11 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-12 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-13 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-14 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-15 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-16 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-17 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-18 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-19 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-20 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-21 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-22 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-23 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-24 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-25 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-26 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-27 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-28 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-29 ║ NULL     ║      0.00 ║     0.00 ║
║ 2013-06-30 ║ NULL     ║      0.00 ║     0.00 ║
╚════════════╩══════════╩═══════════╩══════════╝

In which we are getting the expected result set, One thing to note is the account was not shown on all rows but only on the valid dates on the right-hand table (#ACNTHEADER) and the requirement is to show account in all rows.

How to make the account duplicate for each row in table A (DIMDATE). For this, we use CROSS JOIN


SELECT MDATE,
       CR.ACNTCODE

  FROM #DIMDATE D 
  CROSS JOIN (
              SELECT DISTINCT ACNTCODE
              FROM #ACNTHEADER
              )CR

╔════════════╦══════════╗
║   MDATE    ║ ACNTCODE ║
╠════════════╬══════════╣
║ 2013-06-01 ║     1174 ║
║ 2013-06-02 ║     1174 ║
║ 2013-06-03 ║     1174 ║
║ 2013-06-04 ║     1174 ║
║ 2013-06-05 ║     1174 ║
║ 2013-06-06 ║     1174 ║
║ 2013-06-07 ║     1174 ║
║ 2013-06-08 ║     1174 ║
║ 2013-06-09 ║     1174 ║
║ 2013-06-10 ║     1174 ║
║ 2013-06-11 ║     1174 ║
║ 2013-06-12 ║     1174 ║
║ 2013-06-13 ║     1174 ║
║ 2013-06-14 ║     1174 ║
║ 2013-06-15 ║     1174 ║
║ 2013-06-16 ║     1174 ║
║ 2013-06-17 ║     1174 ║
║ 2013-06-18 ║     1174 ║
║ 2013-06-19 ║     1174 ║
║ 2013-06-20 ║     1174 ║
║ 2013-06-21 ║     1174 ║
║ 2013-06-22 ║     1174 ║
║ 2013-06-23 ║     1174 ║
║ 2013-06-24 ║     1174 ║
║ 2013-06-25 ║     1174 ║
║ 2013-06-26 ║     1174 ║
║ 2013-06-27 ║     1174 ║
║ 2013-06-28 ║     1174 ║
║ 2013-06-29 ║     1174 ║
║ 2013-06-30 ║     1174 ║
╚════════════╩══════════╝

Why I used DISTINCT in SUB QUERY for CROSS JOIN is to minimize the rows to process. As it can lead to multiplying the rows based on the number of rows in each table. if DISTINCT is not used this will lead to wrong results as it will have duplicates. Also, be careful when doing CROSS JOIN as it can be a resource consuming process.

Now all we have to do the aggregation on the value that matches with the value in #ACNTHEADER for which we use a LEFT JOIN on the #ACNTHEADER table


SELECT MDATE,
       CR.ACNTCODE,
       ISNULL(SUM(CASE WHEN A.VALUE > 0 THEN A.VALUE ELSE 0 END),0) POSITIVE ,
       ISNULL(SUM(CASE WHEN A.VALUE < 0 THEN A.VALUE ELSE 0 END) ,0) NEGATIVE

  FROM #DIMDATE D 
  CROSS JOIN (
              SELECT DISTINCT ACNTCODE
              FROM #ACNTHEADER
              )CR
  LEFT JOIN

           #ACNTHEADER A ON a.ACNTCODE = cr.ACNTCODE 
                       AND d.MDATE= a.DDATE
         GROUP BY  MDATE,
       CR.ACNTCODE

Which is the required result set. Now, what went wrong on the first query. Even the pre-aggregation is not required! And all this was because of the filter clause on DATE value which resulted in duplicates.


D.MDATE >= A.DDATE AND D.MDATE <= (SELECT MAX(B.DDATE)
                                FROM #ACNTHEADER B
                                WHERE B.ACNTCODE = A.ACNTCODE)

I am leaving this to the reader to see how the filter works by running the query for one row and see how its getting changed which will be a good exercise for you analyzing SQL code.

NB Also, note I have only considered the June month for 2013, change dimdate table according to your need.

Tables

-- Original table
CREATE TABLE dbo.History
(
    Effective   datetime NOT NULL,      -- When the value became current
    Product     integer NOT NULL,       -- The product
    Kind        tinyint NOT NULL,       -- The price kind (RRP, special, etc)
    Price       smallmoney NOT NULL,    -- The new price

    CONSTRAINT [PK dbo.History Effective, Product, Kind]
        PRIMARY KEY CLUSTERED
            (Effective, Product, Kind)
);

-- Holding area for duplicates that may be needed in future
CREATE TABLE dbo.HistoryDuplicates
(
    Effective   datetime NOT NULL,
    Product     integer NOT NULL,
    Kind        tinyint NOT NULL,
    Price       smallmoney NOT NULL,

    CONSTRAINT [PK dbo.HistoryDuplicates Product, Kind, Effective]
        PRIMARY KEY CLUSTERED
            (Product, Kind, Effective)
);

Initial Data

CREATE TABLE #NewRows
(
    Effective   datetime NOT NULL,
    Product     integer NOT NULL,
    Kind        tinyint NOT NULL,
    Price       smallmoney NOT NULL,
    [Action]    char(1) NOT NULL DEFAULT 'X',
);

CREATE UNIQUE CLUSTERED INDEX cuq
ON #NewRows (Product, Kind, Effective);

INSERT #NewRows
    (Effective, Product, Kind, Price)
VALUES
    ('2013-04-23T00:23:00', 1234, 1, 1.00),
    ('2013-04-24T00:24:00', 1234, 1, 1.00),
    ('2013-04-25T00:25:00', 1234, 1, 1.50),
    ('2013-04-25T00:25:00', 1234, 2, 2.00),
    ('2013-04-26T00:26:00', 1234, 1, 2.00),
    ('2013-04-27T00:27:00', 1234, 1, 2.00),
    ('2013-04-28T00:28:00', 1234, 1, 1.00);

The first step is to remove any redundancies in the input data, storing the removed data in the new holding table:

-- Remove redundancies in the input set
-- and save in the duplicates table
DELETE NR
OUTPUT
    DELETED.Effective,
    DELETED.Product,
    DELETED.Kind,
    DELETED.Price
INTO dbo.HistoryDuplicates
    (Effective, Product, Kind, Price)
FROM #NewRows AS NR
OUTER APPLY
(
    SELECT TOP (1)
        NR2.Price
    FROM #NewRows AS NR2
    WHERE
        NR2.Product = NR.Product
        AND NR2.Kind = NR.Kind
        AND NR2.Effective < NR.Effective
    ORDER BY
        NR2.Effective DESC
) AS X
WHERE
    EXISTS (SELECT X.Price INTERSECT SELECT NR.Price);

De-duplication execution plan

Classifying input rows

The next step is to decide whether each row in the input table is redundant (w.r.t the History table) or not. The following query sets the Action column of the input set data appropriately:

-- Decide what to do with each row
UPDATE NewRows
SET [Action] = 
    CASE
        WHEN NOT EXISTS (SELECT ExistingRow.Price INTERSECT SELECT NewRows.Price)
            THEN 'I' -- Insert
        WHEN ExistingRow.Price = NewRows.Price
            THEN 'D' -- Duplicate
        ELSE 'X'
    END
FROM #NewRows AS NewRows
OUTER APPLY
(
    SELECT TOP (1)
        H.Price,
        H.Effective
    FROM dbo.History AS H
    WHERE
        H.Product = NewRows.Product
        AND H.Kind = NewRows.Kind
        AND H.Effective <= NewRows.Effective
    ORDER BY
        H.Effective DESC
) AS ExistingRow;

CREATE UNIQUE CLUSTERED INDEX cuq
ON #NewRows 
    ([Action], Product, Kind, Effective) 
WITH (DROP_EXISTING = ON);

Action execution plan

Store Redundant Rows

Now we store the rows identified as redundant to the holding table:

-- Store duplicates
WITH Duplicates AS
(
    SELECT NR.*
    FROM #NewRows AS NR
    WHERE NR.[Action] = 'D'
)
MERGE dbo.HistoryDuplicates AS HD
USING Duplicates AS NR
    ON NR.Product = HD.Product
    AND NR.Kind = HD.Kind
    AND NR.Effective = HD.Effective
WHEN NOT MATCHED BY TARGET THEN 
    INSERT (Product, Kind, Effective, Price) 
    VALUES (Product, Kind, Effective, Price);

Redundant records plan

New History rows

The non-redundant rows are added to the History table:

-- Inserts
WITH Inserts AS
(
    SELECT NR.*
    FROM #NewRows AS NR
    WHERE NR.[Action] = 'I'
)
MERGE dbo.History AS H
USING Inserts AS NR
    ON NR.Product = H.Product
    AND NR.Kind = H.Kind
    AND NR.Effective = H.Effective
    AND NR.Price = H.Price
WHEN NOT MATCHED BY TARGET THEN 
    INSERT (Effective, Product, Kind, Price)
    VALUES (Effective, Product, Kind, Price);

New rows plan

Reinstating redundant records

Adding new records can result in redundant rows needing to be reinstated. The following query identifies qualifying redundant rows and moves them to the History table:

DELETE NextDuplicate
OUTPUT DELETED.Product,
       DELETED.Kind,
       DELETED.Price,
       DELETED.Effective
INTO dbo.History
    (Product, Kind, Price, Effective)
FROM #NewRows AS NR
CROSS APPLY
(
    SELECT TOP (1)
        H.Effective,
        H.Price
    FROM dbo.History AS H
    WHERE
        H.Product = NR.Product
        AND H.Kind = NR.Kind
        AND H.Effective > NR.Effective
    ORDER BY
        H.Effective ASC
) AS NextHistory
CROSS APPLY
(
    SELECT TOP (1)
        HD.Effective,
        HD.Product,
        HD.Kind,
        HD.Price
    FROM dbo.HistoryDuplicates AS HD
    WHERE
        HD.Product = NR.Product
        AND HD.Kind = NR.Kind
        AND HD.Effective > NR.Effective
        AND HD.Effective < NextHistory.Effective
    ORDER BY
        HD.Effective ASC
) AS NextDuplicate
WHERE
    NR.[Action] = 'I'
    AND NOT EXISTS (SELECT NextDuplicate.Price INTERSECT SELECT NR.Price);

Relocate rows plan

Results

History table

╔═════════════════════════╦═════════╦══════╦═══════╗
║        Effective        ║ Product ║ Kind ║ Price ║
╠═════════════════════════╬═════════╬══════╬═══════╣
║ 2013-04-23 00:23:00.000 ║    1234 ║    1 ║ 1.00  ║
║ 2013-04-25 00:25:00.000 ║    1234 ║    1 ║ 1.50  ║
║ 2013-04-25 00:25:00.000 ║    1234 ║    2 ║ 2.00  ║
║ 2013-04-26 00:26:00.000 ║    1234 ║    1 ║ 2.00  ║
║ 2013-04-28 00:28:00.000 ║    1234 ║    1 ║ 1.00  ║
╚═════════════════════════╩═════════╩══════╩═══════╝

History Duplicates table

╔═════════════════════════╦═════════╦══════╦═══════╗
║        Effective        ║ Product ║ Kind ║ Price ║
╠═════════════════════════╬═════════╬══════╬═══════╣
║ 2013-04-24 00:24:00.000 ║    1234 ║    1 ║ 1.00  ║
║ 2013-04-27 00:27:00.000 ║    1234 ║    1 ║ 2.00  ║
╚═════════════════════════╩═════════╩══════╩═══════╝

Processing new data

The preceding steps are quite general. We can process a new batch of rows using exactly the same code. The next script loads the input table with two sample rows, one of which is a duplicate, and the other an example of needing to reinstate a previously-redundant row:

-- If needed
DROP TABLE #NewRows;

CREATE TABLE #NewRows
(
    Effective   datetime NOT NULL,
    Product     integer NOT NULL,
    Kind        tinyint NOT NULL,
    Price       smallmoney NOT NULL,
    [Action]    char(1) NOT NULL DEFAULT 'X',
);

CREATE UNIQUE CLUSTERED INDEX cuq
ON #NewRows (Product, Kind, Effective);

INSERT #NewRows
    (Effective, Product, Kind, Price)
VALUES
    ('2013-04-24 00:24:01.000', 1234, 1, 1.00), -- New duplicate
    ('2013-04-26 00:26:30.000', 1234, 1, 5.00); -- Complex new row

Running the rest of the general script produces this final state:

History:

╔═════════════════════════╦═════════╦══════╦═══════╗
║        Effective        ║ Product ║ Kind ║ Price ║
╠═════════════════════════╬═════════╬══════╬═══════╣
║ 2013-04-23 00:23:00.000 ║    1234 ║    1 ║ 1.00  ║
║ 2013-04-25 00:25:00.000 ║    1234 ║    1 ║ 1.50  ║
║ 2013-04-25 00:25:00.000 ║    1234 ║    2 ║ 2.00  ║
║ 2013-04-26 00:26:00.000 ║    1234 ║    1 ║ 2.00  ║
║ 2013-04-26 00:26:30.000 ║    1234 ║    1 ║ 5.00  ║ <- New
║ 2013-04-27 00:27:00.000 ║    1234 ║    1 ║ 2.00  ║ <- Recovered
║ 2013-04-28 00:28:00.000 ║    1234 ║    1 ║ 1.00  ║
╚═════════════════════════╩═════════╩══════╩═══════╝

History Duplicate:

╔═════════════════════════╦═════════╦══════╦═══════╗
║        Effective        ║ Product ║ Kind ║ Price ║
╠═════════════════════════╬═════════╬══════╬═══════╣
║ 2013-04-24 00:24:00.000 ║    1234 ║    1 ║ 1.00  ║
║ 2013-04-24 00:24:01.000 ║    1234 ║    1 ║ 1.00  ║ <- New duplicate
╚═════════════════════════╩═════════╩══════╩═══════╝

Best Answer

Related Solutions

Sql-server – Select multiple rows from one row based on column values

Sql-server – How to merge data sets without including redundant rows