Sql-server – Creating Indexed View GROUP BY Epoch Date

materialized-viewsql server

I have a few big tables with about 6 billion rows that I was looking to optimize. Clustered key is Epoch (unix date time which is the number of seconds that has passed after 1970) and customer ID. This table records usage data per customer per product type.

For example, if this were for a Telco, TypeID 1 is a local call and the value is how many minutes used for that customer. TypeID2 is a international call and is the value how many minutes were used in that hour for that customer. Let's say TypeID3 is a special discounted rate for domestic calling.

The data is stored in 1 hour intervals. I want the indexed view to store the aggregated 24 hour value so when we run a query for 1 day per customer, it has to only look up 1 row in the indexed view instead of 24 rows in the base table.

This is the base table:

ColRowID (bigint)
AggregateID (int)
Epoch (int)
CustomerID (int)
TypeID  (tinyint)
ErrorID (smallint)
Value (int)

We don't care about Aggregate or RowID for our reporting purposes, so I figure the indexed view will look like this:

CREATE VIEW [ixvw_AggTbl]
WITH SCHEMABINDING
AS
SELECT Epoch, CustomerID, TypeID, ErrorID, SUM(Value)
FROM DBO.BaseTbl
-- GROUP BY Epoch  (what goes here?? Epoch/86400?  If I do that I have to 
-- put Epoch/86400 in the SELECT list as well)

EDIT:

Sample base data ( i left out the columns we don't need in this case, just assume the ID columns are there). Each "TypeID" will have a value assigned to it, which the value can be 0.

For example,

    Epoch / Customer ID / TypeID / Value
    /* Epoch 90,000 is day 2 1am */

90000 (1am  day 2) / 1 / 1 / 200
90000 (1am  day 2) / 1 / 2 / 100
90000 (1am  day 2) / 1 / 3 / 120

/* Customer ID 2 as well */
90000 (1am  day 2) / 2 / 1 / 100
90000 (1am  day 2) / 2 / 2 / 50
90000 (1am  day 2) / 2 / 3 / 310

... (repeat for 30,000 customers)

/* Customer ID 1 2am day 1) */
93600 (2am day 2) / 1 / 1 / 150
93600 (2am day 2) / 1 / 2 / 0
93600 (2am day 2) / 1 / 3 / 550

/* Customer ID 2 2am day 2) */
93600 / 2 / 1 / 80
93600 / 2 / 2 / 150
93600 / 2 / 3 / 300
... (repeat for 30,000 customers)

Let's assume all the other VALUE columns are 0 for the remainder of the day since the system went down and no one could use their phones after 2am. I want my indexed view to record the value column aggregated per day, per customerID and TypeID.

Sample would be:

172800 (Day 3 midnight) / 1 / 1 / 350  --Cust ID 1 aggregated all type id 1 in the past 24 hours
172800 (Day 3 midnight) / 1 / 2 / 100
172800 (Day 3 midnight) / 1 / 3 / 670
172800 (Day 3 midnight) / 2 / 1 / 180  --Cust ID 2 now
172800 (Day 3 midnight) / 2 / 2 / 200
172800 (Day 3 midnight) / 2 / 3 / 610
--Repeat by adding 86400 to the epoch to gather the summary data of the rows for the previous day.

Best Answer

I think there's some misunderstanding about what you're attempting to do here.

Since your current design is to return all 24 rows from the base table, presumably all the supplementary fields are returned as well (to display in a grid, or something).

In order to fully aggregate the Value column, all the supplementary columns cannot be included in the SELECT list. Alternatively, if those columns are included in the GROUP BY clause, the view would represent only a partial aggregation, as there would be one row for each unique combination of the columns in the GROUP BY column list.

The only way I see something like this being useful is if the supplementary columns aren't included in the view, and there is some other process that requires only the daily aggregated values, without the base row data. Such a view could be defined like this:

CREATE TABLE [dbo].[BaseTbl]
(
    ColRowID bigint NOT NULL,
    AggregateID int NOT NULL,
    Epoch int NOT NULL,
    CustomerID int NOT NULL,
    TypeID tinyint NOT NULL,
    ErrorID smallint NOT NULL,
    Value int NOT NULL,

    PRIMARY KEY CLUSTERED(Epoch, CustomerId)
);
GO

CREATE VIEW [dbo].[ixvw_AggTbl]
    WITH SCHEMABINDING
AS
    SELECT
        t.Epoch / 86400 AS EpochDay,
        CustomerID,
        TypeID,
        SUM(t.Value) AS TotalValue,
        COUNT_BIG(*) AS __RowCount
        FROM [dbo].[BaseTbl] t
        GROUP BY
            t.Epoch / 86400,
            CustomerID,
            TypeID;
GO

CREATE UNIQUE CLUSTERED INDEX IX_ixvw_AggTbl
    ON [dbo].[ixvw_AggTbl](EpochDay, CustomerID, TypeID);

Unfortunately, you can't go farther and convert the EpochDay column to an actual date within the indexed view because DATEADD is non-deterministic (see Aaron's comment below for why), so you'd have to convert it in the actual SELECT query against the view. But that's not too difficult.

In any event, as I said before, I'm not sure how useful this would be for your specific application.