Sql-server – Check if query will cause duplicate row

duplication, sql-server

I have a table with a primary key on the first column (words).
I want to trim the leading and trailing spaces from my data, but I get an error because the trimmed values would create duplicate keys.

Example: ' David ' would become 'David', which is already in the table.

Is there any way to compare what I already have in the table against my SELECT statement (without spaces) and remove any rows that would cause a duplicate?

Best Answer

SELECT Func(PK),count(*)
FROM tab
GROUP BY Func(PK)
HAVING Count(*)>1 ;

where Func() is whatever expression you use to strip the spaces from the PK column.

Example:

SELECT LTRIM(RTRIM(PK)),count(*)
FROM tab
GROUP BY LTRIM(RTRIM(PK))
HAVING Count(*)>1 ;
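
The query above only reports the trimmed values and their counts. If you also want to see which stored rows are involved, something like the following might help. It is only a sketch, assuming SQL Server 2005 or later and the same placeholder names (tab, PK) used above; the bracketed_pk column is added purely to make the padding visible.

```sql
-- List every stored key that shares its trimmed form with another row.
-- tab and PK are the placeholder names from the answer above.
SELECT PK,
       '[' + PK + ']' AS bracketed_pk   -- brackets expose leading/trailing spaces
FROM (
    SELECT PK,
           COUNT(*) OVER (PARTITION BY LTRIM(RTRIM(PK))) AS n
    FROM tab
) AS x
WHERE n > 1
ORDER BY LTRIM(RTRIM(PK)), PK;
```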

Example 2 (as suggested by Martin in the comments):

WITH cte AS 
(SELECT ROW_NUMBER() OVER (PARTITION BY LTRIM(PK) ORDER BY LTRIM(PK)) AS rn 
FROM tab) 
DELETE FROM cte WHERE rn > 1;
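
To go from the check to the actual cleanup, the delete and the trim can be run together. The following is a rough sketch rather than a tested script. It assumes SQL Server, a varchar or nvarchar key, the placeholder names tab and PK, and that no foreign keys reference the rows being removed. SQL Server ignores trailing spaces when comparing strings with = or <>, so DATALENGTH is used here to detect rows that still carry padding.

```sql
BEGIN TRANSACTION;

-- Keep one row per trimmed value, preferring a row that is already clean.
WITH cte AS (
    SELECT ROW_NUMBER() OVER (
               PARTITION BY LTRIM(RTRIM(PK))
               ORDER BY CASE WHEN DATALENGTH(PK) = DATALENGTH(LTRIM(RTRIM(PK)))
                             THEN 0 ELSE 1 END,
                        PK
           ) AS rn
    FROM tab
)
DELETE FROM cte WHERE rn > 1;

-- Each trimmed value now belongs to exactly one row, so trimming cannot collide.
UPDATE tab
SET PK = LTRIM(RTRIM(PK))
WHERE DATALENGTH(PK) <> DATALENGTH(LTRIM(RTRIM(PK)));

COMMIT TRANSACTION;
```

Run the SELECT versions first and take a backup before deleting anything; which duplicate is worth keeping depends on the other columns in your table.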

Related Solutions

Mysql – Need to find duplicate entries

Suppose your table is called ingredients. Try the following:

Step 01) Create an empty delete keys table called ingredients_delete_keys

CREATE TABLE ingredients_delete_keys
SELECT fk,recipe,pkey FROM ingredients WHERE 1=2;

Step 02) Create PRIMARY KEY on ingredients_delete_keys

ALTER TABLE ingredients_delete_keys ADD PRIMARY KEY (fk,recipe,pkey);

Step 03) Index the ingredients table with fk,recipe,pkey

ALTER TABLE ingredients ADD INDEX fk_recipe_pkey_ndx (fk,recipe,pkey);

Step 04) Populate the ingredients_delete_keys table

INSERT INTO ingredients_delete_keys
SELECT fk,recipe,MIN(pkey)
FROM ingredients GROUP BY fk,recipe;

Step 05) Perform a DELETE JOIN on ingredients table using keys that don't match

DELETE B.*
FROM ingredients B
LEFT JOIN ingredients_delete_keys A
USING (fk,recipe,pkey)
WHERE A.pkey IS NULL;

Step 06) Drop the delete keys

DROP TABLE ingredients_delete_keys;

Step 07) Get rid of the fk_recipe_pkey_ndx index

ALTER TABLE ingredients DROP INDEX fk_recipe_pkey_ndx;
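
Once the steps have run, a quick sanity check along these lines should come back empty. It is just a sketch that reuses the table and column names from the steps above; the copies alias is only for readability.

```sql
-- Any (fk, recipe) pair still listed here has more than one row left.
SELECT fk, recipe, COUNT(*) AS copies
FROM ingredients
GROUP BY fk, recipe
HAVING COUNT(*) > 1;
```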

OK Here are all the lines in one block...

CREATE TABLE ingredients_delete_keys
SELECT fk,recipe,pkey FROM ingredients WHERE 1=2;
ALTER TABLE ingredients_delete_keys ADD PRIMARY KEY (fk,recipe,pkey);
ALTER TABLE ingredients ADD INDEX fk_recipe_pkey_ndx (fk,recipe,pkey);
INSERT INTO ingredients_delete_keys
SELECT fk,recipe,MIN(pkey)
FROM ingredients GROUP BY fk,recipe;
DELETE B.*
FROM ingredients B
LEFT JOIN ingredients_delete_keys A
USING (fk,recipe,pkey)
WHERE A.pkey IS NULL;
DROP TABLE ingredients_delete_keys;
ALTER TABLE ingredients DROP INDEX fk_recipe_pkey_ndx;

Give it a Try !!!

CAVEAT

Notice that using the MIN function keeps the first pkey entered for each (fk, recipe) pair. If you switch to the MAX function instead, the last pkey entered for each pair is kept.

Sql-server – Unique index corrupted SQL. Select query returns single row but create unique index fails

Consider this error: Msg 1505, Level 16, State 1, Line 2 The CREATE UNIQUE INDEX statement terminated because a duplicate key was found for the object name 'dbo.MSmerge_contents' and the index name 'uc1SycContents'. The duplicate key value is (7696031, 08703987-557d-e111-9888-e61f13c44f03)...... I am running the query below
select * from msmerge_contents where rowguid='08703987-557d-e111-9888-e61f13c44f03'
and it returns only 1 row.

When you have index corruption problems (i.e. keys present in a nonclustered (NC) index but not in the base table, or vice versa) you must be very careful about the SQL you use to validate data. At this moment your data is inconsistent, but the query optimizer does not know that and completely trusts your schema, including the incorrect indexes. As such it may optimize your query to use one of the NC indexes that is missing a key, and the result will also miss a key, falsely reporting no duplicates. To get out of this catch-22 you need to force the optimizer's hand by explicitly requesting one index or another, and make sure the projected list of columns can be satisfied by the index you forced (i.e. no *). Assuming uc1SycContents is not the clustered index, try the following:

select rowguid
from msmerge_contents with (INDEX(1))
where rowguid='08703987-557d-e111-9888-e61f13c44f03';

select rowguid
from msmerge_contents with (INDEX([uc1SycContents]))
where rowguid='08703987-557d-e111-9888-e61f13c44f03';

This forces a check of whether that guid has a duplicate in the base table's clustered index (index id 1) versus the index uc1SycContents. I expect the first query to return 2 (or more) rows while the second returns 1.
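
If the two queries do disagree, a consistency check on the table can confirm that the indexes and the base table are out of sync. This is only a sketch, assuming you have permission to run DBCC and can afford the I/O it generates; ideally run it against a restored copy of the database first.

```sql
-- Report structural and index consistency errors for the table only.
-- NO_INFOMSGS suppresses informational output; ALL_ERRORMSGS shows every error found.
DBCC CHECKTABLE ('dbo.MSmerge_contents') WITH NO_INFOMSGS, ALL_ERRORMSGS;
```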
