MySQL: unique combination of two columns

MySQL

This is my friends table

userid(int) | friendid(int) | confirmed(tiny int)
--------------------
1 | 2 | 1
2 | 1 | 1
1 | 3 | 1
3 | 1 | 1
1 | 5 | 0
2 | 4 | 1
4 | 2 | 1
4 | 1 | 1
1 | 4 | 1

I have about 200k rows like this.

I want to delete duplicates. For example if userid=1 and friendid=2 then I don't want the row userid=2, friendid=1.

I want unique combinations across the two columns. This means if (1, 2) is there then (2, 1) should not be there and should be deleted.

Then I can make a friend list query like this

select * from friends where (userid=1 or friendid=1) and confirmed=1

My question is how to delete these duplicate rows?

Please also help me understand how to find the mutual friends of 2 users (userid of them may reside in either userid column or friendid in friends table).

Best Answer

You can use a non-standard MySQL construction and delete from a self-join:

delete f1.* 
from friends f1 
join friends f2 
    on f1.userid = f2.friendid 
   and f1.friendid = f2.userid 
   and f1.userid > f2.userid;

when you have cleaned up the table, next step is to prevent this from happening again. The obvious choice would be a CHECK constraint guaranteeing that userid < friendid:

alter table friends add constraint ... check(userid < friendid)

and a unique constraint on these two columns:

alter table friends add constraint ... unique (userid, friendid);

Unfortunately CHECK constraint is not (AFAIK) supported by MySQL. You can however mimic this with generated columns:

alter table friends 
    add leastid int as (least(userid, friendid)) 
    persistent after friendid;
alter table friends 
    add greatestid int as (greatest(userid, friendid)) 
    persistent after friendid;

Now we can add a unique constraint on the generated columns:

alter table friends add constraint ak1_friends 
    unique (leastid, greatestid);

insert into friends (userid, friendid) values (1,2);
ERROR 1062 (23000): Duplicate entry '1-2' for key 'ak1_friends'
insert into friends (userid, friendid) values (2,1);
ERROR 1062 (23000): Duplicate entry '1-2' for key 'ak1_friends'

To find common friends among for example 3 and 5 you can use a query like:

select f1.userid, f1.friendid, f2.userid, f2.friendid 
from friends f1 
join friends f2 
    on f1.userid in (f2.userid, f2.friendid) 
    or f1.friendid in (f2.userid, f2.friendid) 
where 5 in (f1.userid, f1.friendid) 
  and 3 in (f2.userid, f2.friendid);