SQL Server 2008 – Calculate Average Between Two Dates of Different Rows

sql-server-2008

I have the table below (dates are in yyyy-dd-mm format):

BIDID  AppID  AppStatus  Time
23390  16     In Review  2017-07-03
23390  16     Approved   2017-09-03
23390  16     In Review  2017-10-11
23390  16     Approved   2017-12-11
23390  16     Approved   2017-14-11

I want to calculate the time difference according to the following logic:

First find an In Review status, then get the next Approved status, and calculate the time difference between the two. Then find the next In Review status and the next Approved status and calculate that time difference. Finally, take the average of the differences.

So in this case what I am looking for is:

2017-07-03 - 2017-09-03 = 2 days
2017-10-11 - 2017-12-11 = 2 days

The last Approved row is ignored because there is no matching In Review before it.

Then the overall average is (2 + 2) / 2 = 2.

Can someone please tell me how I can achieve this?

Thanks

Best Answer

WITH ct AS
(
SELECT
    CASE WHEN AppStatus = 'Approved'
              AND LAG(AppStatus) 
                  OVER (PARTITION BY BIDID, AppID ORDER BY [Time]) = 'In Review'
         THEN 
             DATEDIFF (day, LAG([Time]) OVER (PARTITION BY BIDID, AppId 
                                             ORDER BY [Time]), [Time])
         ELSE
             0
    END days
FROM
    t
)
SELECT
    SUM(days) / COUNT(*) Average
FROM
    ct
WHERE
    days <> 0;

The CTE calculates DATEDIFF against the previous row every time it finds an 'Approved' AppStatus immediately after an 'In Review' AppStatus; every other row gets 0.

BIDID | AppID | AppStatus | Time                | days
----: | ----: | :-------- | :------------------ | ---:
23390 |    16 | In Review | 07/03/2017 00:00:00 |    0
23390 |    16 | Approved  | 09/03/2017 00:00:00 |    2
23390 |    16 | In Review | 10/11/2017 00:00:00 |    0
23390 |    16 | Approved  | 12/11/2017 00:00:00 |    2
23390 |    16 | Approved  | 14/11/2017 00:00:00 |    0

The outer query then simply sums the calculated days and divides by the number of rows that have a non-zero value.

| Average |
| ------: |
|       2 |

db<>fiddle here

UPDATE

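Because you are on SQL Server 2008, where LAG/LEAD are not available (they were added in SQL Server 2012), and, as you pointed out in the comments, the table has an IDENTITY column, you can simulate LAG/LEAD with an APPLY join that fetches the row with the next ID.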

WITH ct AS
(
SELECT
    CASE WHEN t1.AppStatus = 'In Review' AND t2.AppStatus = 'Approved'
         THEN DATEDIFF(day, t1.[Time], t2.[Time])
         ELSE 0
    END AS days
FROM
    t t1
-- For each row, fetch the row with the next higher id
-- within the same BIDID / AppID group (the equivalent of LEAD).
CROSS APPLY(SELECT TOP 1 *
            FROM t
            WHERE BIDID = t1.BIDID
                  AND AppID = t1.AppID
                  AND id > t1.id
            ORDER BY id) t2
)
SELECT
    SUM(days) / COUNT(*) Average
FROM
    ct
WHERE
    days <> 0;
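One thing to watch with the queries above: SUM(days) / COUNT(*) is integer division, so a fractional average is truncated, and using 0 as the "no match" marker also discards pairs that were approved on the same day as the review. A minimal sketch of one way around both, assuming the question's sample data is loaded into a table variable @t with a hypothetical id IDENTITY column: return NULL for non-matching rows and let AVG skip them.

```sql
-- Hypothetical sample data mirroring the question (dates read as yyyy-dd-mm).
-- The id column stands in for the IDENTITY column mentioned in the comments.
DECLARE @t TABLE
(
    id        int IDENTITY(1, 1) PRIMARY KEY,
    BIDID     int         NOT NULL,
    AppID     int         NOT NULL,
    AppStatus varchar(20) NOT NULL,
    [Time]    datetime    NOT NULL
);

INSERT INTO @t (BIDID, AppID, AppStatus, [Time])
VALUES (23390, 16, 'In Review', '20170307'),
       (23390, 16, 'Approved',  '20170309'),
       (23390, 16, 'In Review', '20171110'),
       (23390, 16, 'Approved',  '20171112'),
       (23390, 16, 'Approved',  '20171114');

WITH ct AS
(
    SELECT CASE WHEN t1.AppStatus = 'In Review' AND t2.AppStatus = 'Approved'
                THEN DATEDIFF(day, t1.[Time], t2.[Time])
           END AS days   -- no ELSE: rows that are not a matching pair yield NULL
    FROM   @t AS t1
    CROSS APPLY (SELECT TOP (1) nxt.AppStatus, nxt.[Time]
                 FROM   @t AS nxt
                 WHERE  nxt.BIDID = t1.BIDID
                   AND  nxt.AppID = t1.AppID
                   AND  nxt.id > t1.id
                 ORDER BY nxt.id) AS t2
)
-- AVG ignores NULLs, and the CAST avoids integer division.
SELECT AVG(CAST(days AS decimal(10, 2))) AS Average
FROM   ct;
```

For the sample rows this averages the two 2-day pairs, giving an average of 2. SQL Server may emit a warning that NULL values were eliminated by the aggregate, which is expected here.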

db<>fiddle here

Related Solutions

SQL Server Row Differences – How to Show Rows Different Between Two Tables or Queries

You don't need 30 join conditions for a FULL OUTER JOIN here.

You can just Full Outer Join on the PK, preserve rows with at least one difference with WHERE EXISTS (SELECT A.* EXCEPT SELECT B.*) and use CROSS APPLY (SELECT A.* UNION ALL SELECT B.*) to unpivot out both sides of the JOINed rows into individual rows.

WITH TableA(Col1, Col2, Col3) 
     AS (SELECT 'Dog',1,1     UNION ALL 
         SELECT 'Cat',27,86   UNION ALL 
         SELECT 'Cat',128,92), 
     TableB(Col1, Col2, Col3) 
     AS (SELECT 'Dog',1,1     UNION ALL 
         SELECT 'Cat',27,105  UNION ALL 
         SELECT 'Lizard',83,NULL) 
SELECT CA.*
FROM   TableA A 
       FULL OUTER JOIN TableB B 
         ON A.Col1 = B.Col1 
            AND A.Col2 = B.Col2 
/*Unpivot the joined rows*/
CROSS APPLY (SELECT 'TableA' AS what, A.* UNION ALL
             SELECT 'TableB' AS what, B.*) AS CA     
/*Exclude identical rows*/
WHERE  EXISTS (SELECT A.* 
               EXCEPT 
               SELECT B.*) 
/*Discard NULL extended row*/
AND CA.Col1 IS NOT NULL      
ORDER BY CA.Col1, CA.Col2

Gives

what   Col1   Col2        Col3
------ ------ ----------- -----------
TableA Cat    27          86
TableB Cat    27          105
TableA Cat    128         92
TableB Lizard 83          NULL

Or a version dealing with the moved goalposts.

SELECT DISTINCT CA.*
FROM   TableA A 
       FULL OUTER JOIN TableB B 
         ON EXISTS (SELECT A.*  INTERSECT  SELECT B.*) 
CROSS APPLY (SELECT 'TableA' AS what, A.* UNION ALL
             SELECT 'TableB' AS what, B.*) AS CA     
WHERE NOT EXISTS (SELECT A.*  INTERSECT  SELECT B.*) 
AND CA.Col1 IS NOT NULL
ORDER BY CA.Col1, CA.Col2
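Both queries above rely on EXCEPT and INTERSECT treating two NULLs as equal, which the = and <> operators do not. That is what lets a row such as ('Lizard', 83, NULL) be compared without ISNULL wrappers on every column. A small standalone sketch of the difference (the literal values are made up for illustration):

```sql
SELECT
    -- EXCEPT compares NULLs as equal, so no row survives and EXISTS is false.
    CASE WHEN EXISTS (SELECT CAST(NULL AS int) EXCEPT SELECT CAST(NULL AS int))
         THEN 'different' ELSE 'same' END AS except_says,
    -- A plain <> comparison against NULL evaluates to UNKNOWN, so the ELSE
    -- branch is taken even though the other side is not NULL.
    CASE WHEN CAST(NULL AS int) <> 105
         THEN 'different' ELSE 'not detected' END AS operator_says;
```

The first column comes back as 'same' and the second as 'not detected', which is why a column-by-column <> comparison needs extra NULL handling while the EXCEPT form does not.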

For tables with many columns it can still be difficult to identify the specific column(s) that differ. For that you can potentially use the query below, though only on relatively small tables, as otherwise this method likely won't perform adequately.

SELECT t1.primary_key,
       y1.c,
       y1.v,
       y2.v
FROM   t1
       JOIN t2
         ON t1.primary_key = t2.primary_key
       CROSS APPLY (SELECT t1.*
                    FOR xml path('row'), elements xsinil, type) x1(x)
       CROSS APPLY (SELECT t2.*
                    FOR xml path('row'), elements xsinil, type) x2(x)
       CROSS APPLY (SELECT n.n.value('local-name(.)', 'sysname'),
                           n.n.value('.', 'nvarchar(max)')
                    FROM   x1.x.nodes('row/*') AS n(n)) y1(c, v)
       CROSS APPLY (SELECT n.n.value('local-name(.)', 'sysname'),
                           n.n.value('.', 'nvarchar(max)')
                    FROM   x2.x.nodes('row/*') AS n(n)) y2(c, v)
WHERE  y1.c = y2.c
       AND EXISTS(SELECT y1.v
                  EXCEPT
                  SELECT y2.v)

Sql-server – How to print out the query SQL text of a transaction by querying the “fn_dblog” or DBCC LOG(‘DataBaseName’)

The transaction log does not contain statements; it contains the physical changes that occurred in the database. If you see a log record that indicates a delete, you cannot know whether it came from a DELETE statement, a MERGE statement or a wide (split) UPDATE statement. If you see an operation indicating an insert, you cannot know whether it was an INSERT (...) VALUES (...), an INSERT (...) SELECT (...), an INSERT (...) EXEC, a MERGE or a wide (split) UPDATE. And so on and so forth. Specifically, the transaction log is not intended to substitute for an audit trace.

The transactional replication agent has the means to reconstruct T-SQL operations with an effect identical to those that changed a published article, but how it does so is not public information.

If you want to monitor data changes, use Change Tracking or Change Data Capture. If you want to monitor T-SQL activity, use profiler traces.
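As a rough illustration of the Change Tracking option, here is a minimal sketch. All names in it (the SalesDb database, the dbo.Orders table and its columns) are hypothetical, and it assumes the script is run while connected to that database from a client that understands the GO batch separator, such as SSMS or sqlcmd.

```sql
-- Hypothetical names throughout: database SalesDb, table dbo.Orders.
-- Turn Change Tracking on at the database level first.
ALTER DATABASE SalesDb
SET CHANGE_TRACKING = ON
    (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);
GO

-- Change Tracking requires a primary key on the tracked table.
CREATE TABLE dbo.Orders
(
    OrderID int         NOT NULL PRIMARY KEY,
    Status  varchar(20) NOT NULL
);
GO

ALTER TABLE dbo.Orders
ENABLE CHANGE_TRACKING
WITH (TRACK_COLUMNS_UPDATED = ON);
GO

INSERT INTO dbo.Orders (OrderID, Status) VALUES (1, 'In Review');
UPDATE dbo.Orders SET Status = 'Approved' WHERE OrderID = 1;
GO

-- List the net changes since version 0, i.e. since tracking was enabled.
SELECT ct.OrderID,
       ct.SYS_CHANGE_VERSION,
       ct.SYS_CHANGE_OPERATION   -- I = insert, U = update, D = delete
FROM   CHANGETABLE(CHANGES dbo.Orders, 0) AS ct;
```

Like the transaction log, this tells you which rows changed and how, not which statement changed them; for the statement text you would still need a trace.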
