This is my friends
table
userid(int) | friendid(int) | confirmed(tiny int)
--------------------
1 | 2 | 1
2 | 1 | 1
1 | 3 | 1
3 | 1 | 1
1 | 5 | 0
2 | 4 | 1
4 | 2 | 1
4 | 1 | 1
1 | 4 | 1
I have about 200k rows like this.
I want to delete duplicates. For example if userid=1
and friendid=2
then I don't want the row userid=2, friendid=1
.
I want unique combinations across the two columns. This means if (1, 2) is there then (2, 1) should not be there and should be deleted.
Then I can make a friend list query like this
select * from friends where (userid=1 or friendid=1) and confirmed=1
My question is how to delete these duplicate rows?
Please also help me understand how to find the mutual friends of 2 users (userid of them may reside in either userid column or friendid in friends table).
Best Answer
You can use a non-standard MySQL construction and delete from a self-join:
when you have cleaned up the table, next step is to prevent this from happening again. The obvious choice would be a CHECK constraint guaranteeing that userid < friendid:
and a unique constraint on these two columns:
Unfortunately CHECK constraint is not (AFAIK) supported by MySQL. You can however mimic this with generated columns:
Now we can add a unique constraint on the generated columns:
To find common friends among for example 3 and 5 you can use a query like: