SQL Server Grouping – Grouping Sessions Using DATETIME Stamp Across Days

group bysql serversql server 2014t-sql

I am working with data which has a session id value which gets recycled over time (asp session ids from IIS to be exact).

I am trying to give them a sequence so that each instance of an ASP_SESSION_ID don't get grouped together.

For example, the data would look like this. Same session id, used in Aug 2016, then again in March 2017.

DTTM                        SESSION_ID
2016-08-29 14:24:28.450     297692378
2017-04-13 23:54:53.760     297692378
2017-04-13 23:59:53.477     297692378
2017-04-14 00:04:52.897     297692378
2017-04-14 00:04:53.790     297692378

At first I thought to just group on the date (at the DAY level), but for the example above, notice how the 2nd instance of the session id spans across midnight. That would cause a 3rd group, when really it's the same session.

So if I could rank them properly, it would be:

DTTM                        SESSION_ID      RANK
2016-08-29 14:24:28.450     297692378       1
2017-04-13 23:54:53.760     297692378       2
2017-04-13 23:59:53.477     297692378       2
2017-04-14 00:04:52.897     297692378       2
2017-04-14 00:04:53.790     297692378       2

Here, ASP_SESSION_ID should be considered a new instance of a session when > 20 minutes passed since the last request.

So, how can I group by or rank the same ASP_SESSION_ID differently as they get re-used over time? e.g. if the next request by that ASP_SESSION_ID is > 20 minutes from the last, group it/rank it differently?

I am just not sure how to attack the problem.

Here are some statements to generate the data above:

CREATE TABLE #TEST
(
DTTM DATETIME, 
SESSION_ID INT
)

INSERT INTO #TEST (DTTM, SESSION_ID)
select '2016-08-29 14:24:28.450', 297692378 union
select '2017-04-13 23:54:53.760', 297692378 union
select '2017-04-13 23:59:53.477', 297692378 union
select '2017-04-14 00:04:52.897', 297692378 union
select '2017-04-14 00:04:53.790', 297692378 

Best Answer

CREATE TABLE TEST
(
DTTM DATETIME, 
SESSION_ID INT
)

INSERT INTO TEST (DTTM, SESSION_ID)
select '2016-08-29 14:24:28.450', 297692378 union
select '2017-04-13 23:54:53.760', 297692378 union
select '2017-04-13 23:59:53.477', 297692378 union
select '2017-04-14 00:04:52.897', 297692378 union
select '2017-04-14 00:04:53.790', 297692378 union
select '2017-04-14 00:44:53.790', 297692378
GO

First I've added a new record just to check it runs for a more than 20 min.

select '2017-04-14 00:44:53.790', 297692378

Then I've added a new column called RANK to store the final result.

ALTER TABLE TEST ADD [RANK] int;

GO

I've used LAG() window function to calculate DATEDIFF between current and next row.

SELECT DTTM, SESSION_ID,
       DATEDIFF(minute, COALESCE(LAG(DTTM) OVER (ORDER BY DTTM, SESSION_ID), DTTM), DTTM) DIF_MIN
FROM   TEST
GO
DTTM                | SESSION_ID | DIF_MIN
:------------------ | ---------: | ------:
29/08/2016 14:24:28 |  297692378 |       0
13/04/2017 23:54:53 |  297692378 |  327450
13/04/2017 23:59:53 |  297692378 |       5
14/04/2017 00:04:52 |  297692378 |       5
14/04/2017 00:04:53 |  297692378 |       0
14/04/2017 00:44:53 |  297692378 |      40

Then I've used a CURSOR just to calculate RANK field. Basically it acumulates minutes between records, until it hits 20 or more minutes.

DECLARE @dttm datetime,
        @session_id int,
        @diff_min int,
        @acm_diff int,
        @rank int,
        @last_dttm datetime;

SET @diff_min = 0;
SET @acm_diff = 0;
SET @rank = 0;
SET @last_dttm = NULL;

DECLARE curMin CURSOR FAST_FORWARD  FOR
SELECT DTTM, SESSION_ID,
       DATEDIFF(minute, COALESCE(LAG(DTTM) OVER (ORDER BY DTTM, SESSION_ID), DTTM), DTTM) DIF_MIN
FROM   TEST

OPEN curMin;
FETCH NEXT FROM curMin INTO @dttm, @session_id, @diff_min;

WHILE @@FETCH_STATUS = 0  
BEGIN
    IF @last_dttm IS NULL OR @acm_diff + @diff_min > 20
    BEGIN
        SET @rank = @rank + 1;
        SET @acm_diff = 0;
    END
    ELSE
    BEGIN
        SET @acm_diff = @acm_diff + @diff_min;
    END
    
    UPDATE TEST
    SET    [RANK] = @rank
    WHERE  DTTM = @dttm
    AND    SESSION_ID = @session_id;
    
    SET @last_dttm = @dttm;
    
    FETCH NEXT FROM curMin INTO @dttm, @session_id, @diff_min;
END

CLOSE curMin;

SELECT   DTTM, SESSION_ID,
         DATEDIFF(minute, DTTM, COALESCE(LEAD(DTTM) OVER (ORDER BY DTTM, SESSION_ID), DTTM)) DIF_MIN,
         [RANK]
FROM     TEST 
ORDER BY DTTM, SESSION_ID;
GO
DTTM                | SESSION_ID | DIF_MIN | RANK
:------------------ | ---------: | ------: | ---:
29/08/2016 14:24:28 |  297692378 |  327450 |    1
13/04/2017 23:54:53 |  297692378 |       5 |    2
13/04/2017 23:59:53 |  297692378 |       5 |    2
14/04/2017 00:04:52 |  297692378 |       0 |    2
14/04/2017 00:04:53 |  297692378 |      40 |    2
14/04/2017 00:44:53 |  297692378 |       0 |    3

dbfiddle here