SQL Server Running Total from User Defined Function

running-totalssql serversql server 2014t-sqlwindow functions

I have the following query.

Select x1.TotalAmount, [ReturnTaxAmount](x1.TotalAmount,@TaxRate) AS TaxAmount, 
       x2.DocumentType 
FROM table1 X1 
  INNER JOIN table2 x2 on t1.id = t2.t1id

Now how do I get the running total of self-declared TaxAmount? I wish to do something like this

Select x1.TotalAmount, 
       SUM([ReturnTaxAmount](x1.TotalAmount,@TaxRate) AS TaxAmount) OVER (PARTITION BY x3.InvId
          ORDER BY x3.createddate
          ROWS BETWEEN UNBOUNDED PRECEDING
          AND 1 PRECEDING) AS TaxAmount, 
       x2.DocumentType 
FROM table1 X1 
  INNER JOIN table2 x2 on t1.id = t2.t1id

[ReturnTaxAmount] is a function that will return calculated decimal value.

This is the method I tried but failed which suspected is because of the table hasn't really being generated yet.

Is it possible to do so or must I create a new declared/temp table to store the table and select the table again to get the running total of it.

Best Answer

You got the syntax slightly wrong. It should be:

SELECT 
    x1.TotalAmount, 
    SUM([ReturnTaxAmount](x1.TotalAmount,@TaxRate))    -- no AS TaxAmount here
        OVER (PARTITION BY x3.InvId
              ORDER BY x3.createddate
              ROWS BETWEEN UNBOUNDED PRECEDING
              AND 1 PRECEDING) AS TaxAmount,           -- it's ok here
    x2.DocumentType 
FROM table1 x1 
  INNER JOIN table2 x2 ON t1.id = t2.t1id 
--- the rest of the code, that joins x3, etc
  ;

The alias should be only once, after the whole column expression, which includes the SUM(...) OVER (...)

You could use a derived table or a CTE as your idea (no need for temporary table). It would be equivalent:

SELECT 
    t.TotalAmount, 
    SUM(t.TaxAmount)                    -- here we use the alias, defined in 
        OVER (PARTITION BY t.InvId      -- the derived table, below
              ORDER BY t.createddate
              ROWS BETWEEN UNBOUNDED PRECEDING
              AND 1 PRECEDING) AS TaxAmount,  
    t.DocumentType 
FROM 
    ( SELECT 
          x1.TotalAmount, 
          [ReturnTaxAmount](x1.TotalAmount,@TaxRate) AS TaxAmount, 
          x2.DocumentType,
          x3.InvId,
          x3.createddate
      FROM table1 x1 
        INNER JOIN table2 x2 ON t1.id = t2.t1id
      --- the rest of the code, that joins x3, etc
    ) AS t ;

Related Solutions

Sql-server – ON DUPLICATE KEY UPDATE question

If you're using SQL 2008, look at the MERGE statement, it can do what other platforms call UPSERT, UPDATE if it exists, otherwise INSERT.

Sql-server – How to write windowing query which sums a column to create discrete buckets

I am not sure what type of performance you are looking for, but if CLR or external app is not an option, a cursor is all that is left. On my aged laptop I get through 1,000,000 rows in about 100 seconds using the following solution. The nice thing about it is that it scales linearly, so I would be looking at a little about 20 minutes to run through the entire thing. With a decent server you will be faster, but not an order of magnitude, so it would still take several minutes to complete this. If this is a one off process, you probably can afford the slowness. If you need to run this as a report or similar regularly, you might want to store the values in the same table un update them as new rows get added, e.g. in a trigger.

Anyway, here is the code:

IF OBJECT_ID('dbo.MyTable') IS NOT NULL DROP TABLE dbo.MyTable;

CREATE TABLE dbo.MyTable(
 Id INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
 v NUMERIC(5,3) DEFAULT ABS(CHECKSUM(NEWID())%100)/100.0
);


MERGE dbo.MyTable T
USING (SELECT TOP(1000000) 1 X FROM sys.system_internals_partition_columns A,sys.system_internals_partition_columns B,sys.system_internals_partition_columns C,sys.system_internals_partition_columns D)X
ON(1=0)
WHEN NOT MATCHED THEN
INSERT DEFAULT VALUES;

--SELECT * FROM dbo.MyTable

DECLARE @st DATETIME2 = SYSUTCDATETIME();
DECLARE cur CURSOR FAST_FORWARD FOR
  SELECT Id,v FROM dbo.MyTable
  ORDER BY Id;

DECLARE @id INT;
DECLARE @v NUMERIC(5,3);
DECLARE @running_total NUMERIC(6,3) = 0;
DECLARE @bucket INT = 1;

CREATE TABLE #t(
 id INT PRIMARY KEY CLUSTERED,
 v NUMERIC(5,3),
 bucket INT,
 running_total NUMERIC(6,3)
);

OPEN cur;
WHILE(1=1)
BEGIN
  FETCH NEXT FROM cur INTO @id,@v;
  IF(@@FETCH_STATUS <> 0) BREAK;
  IF(@running_total + @v > 1)
  BEGIN
    SET @running_total = 0;
    SET @bucket += 1;
  END;
  SET @running_total += @v;
  INSERT INTO #t(id,v,bucket,running_total)
  VALUES(@id,@v,@bucket, @running_total);
END;
CLOSE cur;
DEALLOCATE cur;
SELECT DATEDIFF(SECOND,@st,SYSUTCDATETIME());
SELECT * FROM #t;

GO 
DROP TABLE #t;

It drops and recreates the table MyTable, fills it with 1000000 rows and then goes to work.

The cursor copies each row into a temp table while running the calculations. At the end the select returns the calculated results. You might be a little faster if you don't copy the data around but do an in-place update instead.

If you have an option to upgrade to SQL 2012 you can look at the new window-spool supported moving window aggregates, that should give you better performance.

On a side note, if you have an assembly installed with permission_set=safe, you can do more bad stuff to a server with standard T-SQL than with the assembly, so I would keep working on removing that barrier - You have a good use case here where CLR really would help you.

Best Answer

Related Solutions

Sql-server – ON DUPLICATE KEY UPDATE question

Sql-server – How to write windowing query which sums a column to create discrete buckets

Related Question