Currently working on a thesis where I have been tasked with aggregating data to a new table
I'm trying to build out an aggregated table where I'll store each row group by date and a time interval (let's say 30 minutes).
I came across store procedures and think that this would be the best solution.
My source table is constructed as followed (StartDateTime = datetime2) (
StartDateTime Name ... Path
2021-02-28 20:23:43.1 Tom ... FG_T
2021-02-28 20:24:34.6 Tom ... LS_t
--- --- ... ---
2021-02-28 20:35:09.5 Abby ... FS_t
2021-02-28 21:55:34.6 Tom ... FS_t
2021-XX-XX HH:MM:SS.s Abby ... XX
My desire is to have an output table (must be a table I want to store it over time, not just temp) that counts the amount events that have happened within a given period, this procedure should be run once a day and create an entry for every available "timeslot" from the last 24h.
PeriodStart Period(sek) Name Sum
2021-02-28 20:00:00.0 1800 Tom 2
2021-02-28 20:30:00.0 1800 Abby 1
2021-02-28 21:00:00.0 1800 NULL NULL
2021-02-28 21:30:00.0 1800 Tom 1
Could someone point me in the right direction on where I could find information about creating something like this?
My thanks in advance from an aspiring SQL enthusiast 🙂
Update 1:
As demonstrated within @SMor's answer, I'm seeking to use some sort of tally table to generate timeslots.
Once I looked at the code he provided, I realized that I had forgotten about a few constraints that I would like to address
Clarifying my desired outcome and structure of the current tables:
This task is meant to summarize several connections coming in via a given set of ports (trunks), the newly create table (the one formed within this process) shall later be used as a hooking point for a corresponding View that will be used within an SSRS.
My goal is to count the number of incoming connections within a given timeslot per trunk (hence group by trunk as well).
Once I get this first part in place, it would also be much desired to see the max amount of connections at the same time within a given slot. (Such as MaxConnectionAtOnce )
Table: RAW_DATA
GatewayName | StartDateTime | DisconnectDateTime | ConnectionDuration | Trunk |
---|---|---|---|---|
GW1 | 2021-02-24 20:20:43.1 | 2021-02-24 20:22:32.9 | 52964 | T1 |
GW1 | 2021-02-24 20:21:23.6 | 2021-02-24 20:21:55.1 | 51036 | T1 |
GW1 | 2021-02-24 20:26:50.0 | 2021-02-24 20:28:24.0 | 9490 | T1 |
GW2 | 2021-02-24 20:41:49.0 | 2021-02-24 20:43:24.0 | 9490 | T2 |
GW3 | 2021-02-24 22:46:54.2 | 2021-02-24 22:48:25.2 | 9087 | T1 |
GW99 | 2021-02-24 22:47:25.1 | 2021-02-24 22:47:54.4 | 2917 | T2 |
I tried to add it into a store procedure, I have added block comments ( /* */ ) around the modifications I did.
CREATE OR ALTER PROCEDURE [schemma].[TrunkSummery]
@date datetime2(7),
@period int
AS
BEGIN
-- Creat timeslotable
with numbers(val) as
(select 1 union all select val + 1 from numbers where val < 48)
select @date, nbr.val,
dateadd(minute, (nbr.val - 1) * 30, @date) as period_start,
dateadd(minute, (nbr.val ) * 30, @date) as period_end
from numbers as nbr
order by nbr.val;
with numbers(val) as
(select 1 union all select val + 1 from numbers where val < 48),
periods as (
select @date as [date], nbr.val,
dateadd(minute, (nbr.val - 1) * 30, @date) as period_start,
dateadd(minute, (nbr.val ) * 30, @date) as period_end
from numbers as nbr)
select pers.period_start, @period as Period, src.trunk, count(src.trunk) as 'all'
/*, count(maxAtOnce) */
--from periods as pers left join @raw_data as src
from periods as pers inner join HDO.RAW_DATA as src
on src.StartDateTime >= pers.period_start and src.StartDateTime < pers.period_end
group by src.trunk, pers.period_start, pers.period_end
order by pers.period_start, src.trunk, pers.period_end
END
GO
EXECUTE [schemma].[TrunkSummery] @date = '20210224', @period = 30;
GO
Based on my attempts I think that I somehow need group/order on the trunks as well.
--Current output
period_start Period trunk all
2021-02-24 20:00:00.0000000 30 T1 3
2021-02-24 20:30:00.0000000 30 T1 1
2021-02-24 20:30:00.0000000 30 T2 1
2021-02-24 22:30:00.0000000 30 T1 1
2021-02-24 22:30:00.0000000 30 T2 1
-- Desired
period_start Period trunk all max
2021-02-24 20:00:00.0000000 30 T1 3 2
2021-02-24 20:30:00.0000000 30 T1 1 1
2021-02-24 22:30:00.0000000 30 T1 1 1
2021-02-24 20:30:00.0000000 30 T2 1 1
2021-02-24 22:30:00.0000000 30 T2 1 1
Is this fusible?
Best Answer
You need a tally table (also called a numbers table) to start. These have many uses and can easily be used to generate intervals like you need.
The idea is to generate a sequence of number corresponding to the intervals in a day. You can cross join that against a table of dates to get the complete set of intervals for all the dates of concern to get every possible interval.
With that resultset, you can then join to your source data and aggregate as you typically do. You can inner join if you only want the intervals that exist in your source data or you can outer join to get all intervals. Note that I use the lazy way of generating the numbers (recursive cte) - but the list is very small for this requirement. I also only considered a single date which was set in a local variable
@date
. Therefore that skips the need for a cross join against your set of dates.I leave you a fiddle to demonstrate.