Sql-server – How to group data and write it’s group id for each row

ranksql-server-2012t-sql

I have sample data:

CREATE TABLE #T (Name varchar(5), GroupId int NULL)
INSERT INTO #T (Name)  VALUES 
('A'),
('A'),
('A'),
('B'),
('B'),
('C'),
('D'),
('D') 

Name  GroupId
----- -----------
A     NULL
A     NULL
A     NULL
B     NULL
B     NULL
C     NULL
D     NULL
D     NULL

How can I group that data by name and after it, write groupId (sequential, maybe identity)? That is what I want to get:

Name  GroupId
----- -----------
A     1
A     1
A     1
B     2
B     2
C     3
D     4
D     4

The table to group has about 15m rows. How to do that better? Thanks!

Best Answer

;WITH cteA
AS(SELECT  Name,GroupID,
        DENSE_RANK () OVER(ORDER BY Name) AS New_GroupID
     FROM #T)


UPDATE  cteA
SET GroupID  = New_GroupID

Now , depending of your system , this could take a few seconds or more. You can split the update , to do in chunks. Something like A-G , then from G to M ... and you can add , in the DENSE_RANK something like MaxGroupID + DENSE_RANK () OVER(ORDER BY Name) AS New_GroupID