SQL Server – How to Return Rows Where Row Count is Greater Than or Equal to 2

sql serversql-server-2008-r2t-sql

Table has duplicate data in it. I am in need of identifying the data that is duplicated in a structure like such – identifying each storeID that is duplicated by Hambasa

ABC BLUE 2
ABC Green 2
ABC Orange 2
BET Blue 2
BET Green 2
BET Orange 2

This is sample DDL and the query using Row_Number() I attempted. What would be the proper way to get the result I am after? I am actually wanting to delete the duplicates, so that each storeID only has 1 entry for each Hambasa.

    Declare @TMI Table (HAMBASA varchar(10), storeid varchar(10))
Insert Into @TMI (storeid, hambasa) VALUES
('ABC', 'Blue'), ('ABC', 'Green'), ('ABC', 'Orange'),
('ABC', 'Blue'), ('ABC', 'Green'), ('ABC','Orange'),
('NYC', 'Blue'), ('NYC', 'Green'), ('NYC', 'Orange'),
('BET', 'Blue'), ('BET', 'Green'), ('BET', 'Orange'),
('BET', 'Blue'), ('BET', 'Green'), ('BET', 'Orange')


;With pan As
(
    Select HAMBASA, storeid
    ,RN = Row_Number() Over (Partition By storeid Order By HAMBASA ASC)
    FROM @TMI
) 

Best Answer

Where's the primary key?

DELETE records FROM (
    SELECT 
        ROW_NUMBER() OVER (PARTITION BY HAMBASA, storeid ORDER BY HAMBASA) AS HambasaCount
    FROM @TMI) records
WHERE records.HambasaCount > 1