You can eliminate the duplicates from your query by using the ROW_NUMBER()
aggregate:
IF OBJECT_ID('tempdb..#Table') IS NOT NULL
DROP TABLE #table;
CREATE TABLE #Table
(
ID01 INT NOT NULL
, ID02 INT NOT NULL
);
INSERT INTO #Table (ID01, ID02)
VALUES (1, 1)
, (1, 2) --problematic
, (1, 3) --problematic
, (1, 4) --problematic
, (2, 1)
, (2, 1)
, (2, 2) --problematic
, (3, 1)
, (3, 1)
, (4, 1);
;WITH cte AS (
SELECT DISTINCT
t1_ID01 = t1.ID01
, t1_ID02 = t1.ID02
, rn1 = ROW_NUMBER() OVER (PARTITION BY t1.ID01 ORDER BY t1.ID01, T1.ID02)
, rn2 = ROW_NUMBER() OVER (PARTITION BY t1.ID01, t1.ID02 ORDER BY t1.ID01, T1.ID02)
FROM #Table t1
)
SELECT *
FROM cte
WHERE cte.rn1 > 1
AND cte.rn2 = 1;
The first ROW_NUMBER()
function, rn1
, is used to select rows where there are multiple ID02 values for each individual ID01 value. The second ROW_NUMBER()
function, rn2
, is used to preserve the case where ID01 and ID02 have multiple duplicate values, "which our system already corrects for automatically".
That pattern can be leveraged to remove the invalid rows from the source table, by using the DELETE FROM <cte>
syntax:
;WITH cte AS (
SELECT t1.ID01
, t1.ID02
, rn1 = ROW_NUMBER() OVER (PARTITION BY t1.ID01 ORDER BY t1.ID01, T1.ID02)
, rn2 = ROW_NUMBER() OVER (PARTITION BY t1.ID01, t1.ID02 ORDER BY t1.ID01, T1.ID02)
FROM #Table t1
)
DELETE
FROM cte
WHERE cte.rn1 > 1
AND cte.rn2 = 1;
SELECT *
FROM #Table;
The output; first the select
, then the table after problematic rows have been removed:
Best Answer
That can be done with a simple
SELECT
statement.The
count(*)
command is an aggregate function supported by Postgres and gives the number of rows with duplicate values in a column specified by theGROUP BY
clause.To omit the values without any duplicates, add the
HAVING
clause. AHAVING
is similar to aWHERE
but applies to each row being generated to represent the grouping.To get results to sort with most frequent duplicate values at the top, use the
ORDER BY
clause. We reference the column aliascount_
that we created in the first line with theAS
clause.