Let's say I have two tables that have a one-to-many relationship: People and Attributes. I'm trying to find duplicates based on the first_name, last_name, and all attributes must match exactly.
CREATE TABLE People (Id int,
first_name varchar(100),
last_name varchar(100));
CREATE TABLE Attributes (Id int,
person_id int,
field varchar(100),
field_value varchar(100));
INSERT INTO People VALUES (1, 'John', 'Smith');
INSERT INTO People VALUES (2, 'John', 'Smith');
INSERT INTO People VALUES (3, 'John', 'Smith');
INSERT INTO Attributes VALUES (1, 1, 'HairColor', 'Brown');
INSERT INTO Attributes VALUES (2, 1, 'EyeColor', 'Blue');
INSERT INTO Attributes VALUES (3, 2, 'HairColor', 'Brown');
INSERT INTO Attributes VALUES (4, 2, 'EyeColor', 'Blue');
INSERT INTO Attributes VALUES (5, 3, 'HairColor', 'Blonde');
Which gives us:
id | first_name | last_name
----+------------+-----------
1 | John | Smith
2 | John | Smith
3 | John | Smith
id | person_id | field | field_value
----+-----------+-----------+-------------
1 | 1 | HairColor | Brown
2 | 1 | EyeColor | Blue
3 | 2 | HairColor | Brown
4 | 2 | EyeColor | Blue
5 | 3 | HairColor | Blonde
I'd like a query that will return IDs 1 and 2 from the People table. I can find duplicates within a single table:
select first_name,last_name,count(*) from People
group by first_name,last_name having ( count(*) > 1 );
But I'm having trouble joining the one-to-many table and detecting duplicates across both tables. How can I detect duplicates across tables that have a one-to-many relationship?
Best Answer
One way to do this (check the SQLfiddle):
and another:
Intersect has higher precedence than except/minus, at least in Postgres and SQL Server. Just to be safe, you can use parentheses, to ensure the precedence.