SQL Server Sum – SQL Query for Cumulative Sum

running-totalssql serversql-server-2012sum

I am experiencing problems with formulating a (relatively) simple SQL Query (SQL Server 2012 is used). We have a database which counts something up for certain users. Therefore we have a really simple database structure consisting of two tables.

Table users:

PK_User, uniqueidentifier
ID, bigint
Username, nvarchar(128)
CreationTimestamp, datetime

Table data:

PK_Data, uniqueidentifier
FK_User, uniqueidentifier
FK_Reporter, uniqueidentifier
CreationTimestamp, datetime

I am currently working with the following SQL Statement:

SELECT u.Username, COUNT(d.FK_User) AS 'Count', CAST(FLOOR(CAST(d.CreationTimestamp AS float)) AS datetime) AS 'Date'
FROM data d INNER JOIN users u ON u.PK_User = d.FK_User
GROUP BY CAST(FLOOR(CAST(d.CreationTimestamp AS float)) AS datetime), u.Username
ORDER BY CAST(FLOOR(CAST(d.CreationTimestamp AS float)) AS datetime)

which delivers something like this:

User1   5   %Date1%
User2   3   %Date1%
User1   7   %Date2%
User2   1   %Date2%

So I am able to get the sum for each user for each specific day. I would like to sum this sum up, so that User1 gets for the second day 5+7=12 and User2 3+1=4.

How can I achieve this?

Best Answer

You need a "window" aggregate, i.e. an OVER clause in the aggregate. And because the query already has a GROUP BY, the aggregate needed is the SUM() over the COUNT(d.FK_User) you already have:

SELECT 
    u.Username, 
    -- COUNT(d.FK_User) AS UserCount,     -- if you need both counts
    SUM(COUNT(d.FK_User)) OVER (PARTITION BY u.Username
                                ORDER BY CAST(d.CreationTimestamp AS DATE)
                                ROWS BETWEEN UNBOUNDED PRECEDING
                                         AND CURRENT ROW) 
        AS CumulativeUserCount, 
    CAST(d.CreationTimestamp AS DATE) AS CreationDate
FROM data d 
    INNER JOIN users u 
    ON u.PK_User = d.FK_User
GROUP BY 
    CAST(d.CreationTimestamp AS DATE), u.Username
ORDER BY
    CreationDate ;
  • The ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW is not strictly needed. The default frame for window aggregates when there is an ORDER BY inside OVER () is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which would produce the same results as ROWS but is likely to generate a suboptimal execution plan. See Aaron Bertrand's blog article Best approaches for running totals – updated for SQL Server 2012 for details.
  • I've simplified the complicated expression that seems to just strip the time part form the datetime column.
  • Removed the single quotes around the aliases. Aliases and identifiers better not be reserved keywords and not have special characters or spaces. If you do need any of these, then it's preferred to quote them with double quotes "Count" or square brackets [Count].