PostgreSQL – Write Window Function to Increment on Discrete Values

postgresqlwindow functions

Say I want to do something like:

select 
     my_date,
     row_number() over(partition by my_date order by my_date asc)
from
     my_table

I want the row numbers to come out as:

2018-10-01,1
2018-10-01,1
2018-10-02,2
2018-10-03,3

Not quite sure how to do this. With the current window function, it does:

2018-10-01,1
2018-10-01,2
2018-10-02,1
2018-10-03,1

Best Answer

You want dense_rank() instead of row_number()

select my_date,
       dense_rank() over(order by my_date asc)
from my_table

Related Solutions

Sql-server – How to write windowing query which sums a column to create discrete buckets

I am not sure what type of performance you are looking for, but if CLR or external app is not an option, a cursor is all that is left. On my aged laptop I get through 1,000,000 rows in about 100 seconds using the following solution. The nice thing about it is that it scales linearly, so I would be looking at a little about 20 minutes to run through the entire thing. With a decent server you will be faster, but not an order of magnitude, so it would still take several minutes to complete this. If this is a one off process, you probably can afford the slowness. If you need to run this as a report or similar regularly, you might want to store the values in the same table un update them as new rows get added, e.g. in a trigger.

Anyway, here is the code:

IF OBJECT_ID('dbo.MyTable') IS NOT NULL DROP TABLE dbo.MyTable;

CREATE TABLE dbo.MyTable(
 Id INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
 v NUMERIC(5,3) DEFAULT ABS(CHECKSUM(NEWID())%100)/100.0
);


MERGE dbo.MyTable T
USING (SELECT TOP(1000000) 1 X FROM sys.system_internals_partition_columns A,sys.system_internals_partition_columns B,sys.system_internals_partition_columns C,sys.system_internals_partition_columns D)X
ON(1=0)
WHEN NOT MATCHED THEN
INSERT DEFAULT VALUES;

--SELECT * FROM dbo.MyTable

DECLARE @st DATETIME2 = SYSUTCDATETIME();
DECLARE cur CURSOR FAST_FORWARD FOR
  SELECT Id,v FROM dbo.MyTable
  ORDER BY Id;

DECLARE @id INT;
DECLARE @v NUMERIC(5,3);
DECLARE @running_total NUMERIC(6,3) = 0;
DECLARE @bucket INT = 1;

CREATE TABLE #t(
 id INT PRIMARY KEY CLUSTERED,
 v NUMERIC(5,3),
 bucket INT,
 running_total NUMERIC(6,3)
);

OPEN cur;
WHILE(1=1)
BEGIN
  FETCH NEXT FROM cur INTO @id,@v;
  IF(@@FETCH_STATUS <> 0) BREAK;
  IF(@running_total + @v > 1)
  BEGIN
    SET @running_total = 0;
    SET @bucket += 1;
  END;
  SET @running_total += @v;
  INSERT INTO #t(id,v,bucket,running_total)
  VALUES(@id,@v,@bucket, @running_total);
END;
CLOSE cur;
DEALLOCATE cur;
SELECT DATEDIFF(SECOND,@st,SYSUTCDATETIME());
SELECT * FROM #t;

GO 
DROP TABLE #t;

It drops and recreates the table MyTable, fills it with 1000000 rows and then goes to work.

The cursor copies each row into a temp table while running the calculations. At the end the select returns the calculated results. You might be a little faster if you don't copy the data around but do an in-place update instead.

If you have an option to upgrade to SQL 2012 you can look at the new window-spool supported moving window aggregates, that should give you better performance.

On a side note, if you have an assembly installed with permission_set=safe, you can do more bad stuff to a server with standard T-SQL than with the assembly, so I would keep working on removing that barrier - You have a good use case here where CLR really would help you.

SQL Server 2008 Error – Parallel Data Warehouse Features Not Enabled

The Parallel Data Warehouse (PDW) features are not enabled.

This is a parser bug that exists only in SQL Server 2008. Non-PDW versions of SQL Server before 2012 do not support the ORDER BY clause with aggregate functions like MIN:

Books Online extract

Windowing function support was considerably extended in 2012, compared with the basic implementation available starting with SQL Server 2005. The extensions were made available in Parallel Data Warehouse before being incorporated in the box product. Because the various editions share a common code-base, misleading error messages like this are possible.

If you are interested, the call stack when the aggregate is verified by the parser is shown below. Because the aggregate has an OVER clause with ORDER BY, a check for PDW is issued:

Aggregate verification

This check immediately fails with a parser error:

Parser error

Luckily, you do not need an windowed aggregate that supports ORDER BY framing to solve your code problem.

Best Answer

Related Solutions

Sql-server – How to write windowing query which sums a column to create discrete buckets

SQL Server 2008 Error – Parallel Data Warehouse Features Not Enabled

Related Question