Currently working on a thesis where I have been tasked with aggregating data to a new table

I'm trying to build out an aggregated table where I'll store each row group by date and a time interval (let's say 30 minutes).

I came across store procedures and think that this would be the best solution.

My source table is constructed as followed (StartDateTime = datetime2) (

StartDateTime           Name    ...   Path         
2021-02-28 20:23:43.1   Tom     ...   FG_T
2021-02-28 20:24:34.6   Tom     ...   LS_t
 ---                    ---     ...   ---
2021-02-28 20:35:09.5   Abby    ...   FS_t
2021-02-28 21:55:34.6   Tom     ...   FS_t
2021-XX-XX HH:MM:SS.s   Abby     ...  XX

My desire is to have an output table (must be a table I want to store it over time, not just temp) that counts the amount events that have happened within a given period, this procedure should be run once a day and create an entry for every available "timeslot" from the last 24h.

PeriodStart              Period(sek)    Name     Sum
2021-02-28 20:00:00.0    1800           Tom      2
2021-02-28 20:30:00.0    1800           Abby     1
2021-02-28 21:00:00.0    1800           NULL     NULL
2021-02-28 21:30:00.0    1800           Tom      1

Could someone point me in the right direction on where I could find information about creating something like this?

My thanks in advance from an aspiring SQL enthusiast 🙂

Update 1:

As demonstrated within @SMor's answer, I'm seeking to use some sort of tally table to generate timeslots.

Once I looked at the code he provided, I realized that I had forgotten about a few constraints that I would like to address

Clarifying my desired outcome and structure of the current tables:

This task is meant to summarize several connections coming in via a given set of ports (trunks), the newly create table (the one formed within this process) shall later be used as a hooking point for a corresponding View that will be used within an SSRS.

My goal is to count the number of incoming connections within a given timeslot per trunk (hence group by trunk as well).
Once I get this first part in place, it would also be much desired to see the max amount of connections at the same time within a given slot. (Such as MaxConnectionAtOnce )

Table: RAW_DATA

GatewayName	StartDateTime	DisconnectDateTime	ConnectionDuration	Trunk
GW1	2021-02-24 20:20:43.1	2021-02-24 20:22:32.9	52964	T1
GW1	2021-02-24 20:21:23.6	2021-02-24 20:21:55.1	51036	T1
GW1	2021-02-24 20:26:50.0	2021-02-24 20:28:24.0	9490	T1
GW2	2021-02-24 20:41:49.0	2021-02-24 20:43:24.0	9490	T2
GW3	2021-02-24 22:46:54.2	2021-02-24 22:48:25.2	9087	T1
GW99	2021-02-24 22:47:25.1	2021-02-24 22:47:54.4	2917	T2

I tried to add it into a store procedure, I have added block comments ( /* */ ) around the modifications I did.

CREATE OR ALTER PROCEDURE [schemma].[TrunkSummery]
@date datetime2(7),
@period int
AS
BEGIN
    -- Creat timeslotable
        with numbers(val) as 
            (select 1 union all select val + 1 from numbers where val < 48)
        select @date, nbr.val, 
           dateadd(minute, (nbr.val - 1) * 30, @date) as period_start, 
           dateadd(minute, (nbr.val    ) * 30, @date) as period_end 
        from numbers as nbr 
        order by nbr.val;

        with numbers(val) as 
            (select 1 union all select val + 1 from numbers where val < 48),
        periods as (
            select @date as [date], nbr.val, 
           dateadd(minute, (nbr.val - 1) * 30, @date) as period_start, 
           dateadd(minute, (nbr.val    ) * 30, @date) as period_end 
           from numbers as nbr)
        select pers.period_start, @period as Period, src.trunk, count(src.trunk) as 'all' 
        /*, count(maxAtOnce) */ 
        --from periods as pers left  join @raw_data as src 
        from periods as pers inner join HDO.RAW_DATA as src 
        on src.StartDateTime >= pers.period_start and src.StartDateTime < pers.period_end
        group by src.trunk, pers.period_start, pers.period_end
        order by pers.period_start, src.trunk,  pers.period_end
END
GO

EXECUTE [schemma].[TrunkSummery] @date = '20210224', @period = 30;
GO

Based on my attempts I think that I somehow need group/order on the trunks as well.

--Current output
period_start                 Period   trunk     all
2021-02-24 20:00:00.0000000  30       T1        3
2021-02-24 20:30:00.0000000  30       T1        1
2021-02-24 20:30:00.0000000  30       T2        1
2021-02-24 22:30:00.0000000  30       T1        1
2021-02-24 22:30:00.0000000  30       T2        1


-- Desired
period_start                 Period   trunk     all    max
2021-02-24 20:00:00.0000000  30       T1        3      2
2021-02-24 20:30:00.0000000  30       T1        1      1
2021-02-24 22:30:00.0000000  30       T1        1      1
2021-02-24 20:30:00.0000000  30       T2        1      1
2021-02-24 22:30:00.0000000  30       T2        1      1

Is this fusible?

Fiddle based on @SMor with new table

Best Answer

You need a tally table (also called a numbers table) to start. These have many uses and can easily be used to generate intervals like you need.

The idea is to generate a sequence of number corresponding to the intervals in a day. You can cross join that against a table of dates to get the complete set of intervals for all the dates of concern to get every possible interval.

With that resultset, you can then join to your source data and aggregate as you typically do. You can inner join if you only want the intervals that exist in your source data or you can outer join to get all intervals. Note that I use the lazy way of generating the numbers (recursive cte) - but the list is very small for this requirement. I also only considered a single date which was set in a local variable @date. Therefore that skips the need for a cross join against your set of dates.

with numbers(val) as 
    (select 1 union all select val + 1 from numbers where val < 48),
periods as (
    select @date as [date], nbr.val, 
   dateadd(minute, (nbr.val - 1) * 30, @date) as period_start, 
   dateadd(minute, (nbr.val    ) * 30, @date) as period_end 
   from numbers as nbr)
select pers.period_start, pers.period_end, src.name, count(src.name) as cnt 
--from periods as pers left  join @source as src 
from periods as pers inner join @source as src 
on src.startdt >= pers.period_start and src.startdt < pers.period_end
group by pers.period_start, pers.period_end, src.name
order by pers.period_start, pers.period_end, src.name

I leave you a fiddle to demonstrate.

SQL Server – Group Multiple Tables by Time Interval to New Table Using Stored Procedure

Update 1:

Best Answer

Related Question

Update 1:

Best Answer

Related Solutions

Sql-server – Best table schema to store daily hits for many items

Sql-server – How to reduce the run time of stored procedure by getting rid of cursors

Related Question