Sql-server – How to count repeated rows taking into account other rows in the middle

sql serversql-server-2008-r2

For SQL server 2008 R2
It seems challenging to come up with a single query to the following:
Example given columns a & b:

a |   b
-------------
0 |  2000
1 |  2001
1 |  2002
1 |  2003
2 |  2004
3 |  2005
1 |  2006
1 |  2007
4 |  2008
1 |  2009

Goal: Mark rows with repeated column a and give them unique number taking into account other values in between. Result should be in column c. Note the most difficult part here is to populate column c with 2 & 5 & 7.

a |  b   |  c
-------------
0 |  2000 | 1
1 |  2001 | 2
1 |  2002 | 2
1 |  2003 | 2
2 |  2004 | 3
3 |  2005 | 4
1 |  2006 | 5
1 |  2007 | 5
4 |  2008 | 6
1 |  2009 | 7

Best Answer

This is a problem. One (of the many) ways to solve it (this requires 2012+ versions):

WITH 
  t AS
    ( SELECT a, b, x = CASE WHEN a = LAG(a) OVER (ORDER BY b) 
                           THEN NULL ELSE 1 
                       END
      FROM table_name
    )
SELECT a, b, c = COUNT(x) OVER (ORDER BY b) 
FROM t 
ORDER BY b ;

This should work in 2005 and above:

WITH 
  t AS
    ( SELECT a, b, dx = ROW_NUMBER() OVER (ORDER BY b) 
                        - ROW_NUMBER() OVER (PARTITION BY a ORDER BY b) 
      FROM table_name
    ),
  tt AS
    ( SELECT a, b, mb = MIN(b) OVER (PARTITION BY a, dx)
      FROM t 
    )
SELECT a, b, c = DENSE_RANK() OVER (ORDER BY mb)
FROM tt 
ORDER BY b ;