MySQL: unique combination of two columns

MySQL

This is my friends table

userid(int) | friendid(int) | confirmed(tiny int)
--------------------
1 | 2 | 1
2 | 1 | 1
1 | 3 | 1
3 | 1 | 1
1 | 5 | 0
2 | 4 | 1
4 | 2 | 1
4 | 1 | 1
1 | 4 | 1

I have about 200k rows like this.

I want to delete duplicates. For example if userid=1 and friendid=2 then I don't want the row userid=2, friendid=1.

I want unique combinations across the two columns. This means if (1, 2) is there then (2, 1) should not be there and should be deleted.

Then I can make a friend list query like this

select * from friends where (userid=1 or friendid=1) and confirmed=1

My question is how to delete these duplicate rows?

Please also help me understand how to find the mutual friends of 2 users (userid of them may reside in either userid column or friendid in friends table).

Best Answer

You can use a non-standard MySQL construction and delete from a self-join:

delete f1.* 
from friends f1 
join friends f2 
    on f1.userid = f2.friendid 
   and f1.friendid = f2.userid 
   and f1.userid > f2.userid;

when you have cleaned up the table, next step is to prevent this from happening again. The obvious choice would be a CHECK constraint guaranteeing that userid < friendid:

alter table friends add constraint ... check(userid < friendid)

and a unique constraint on these two columns:

alter table friends add constraint ... unique (userid, friendid);

Unfortunately CHECK constraint is not (AFAIK) supported by MySQL. You can however mimic this with generated columns:

alter table friends 
    add leastid int as (least(userid, friendid)) 
    persistent after friendid;
alter table friends 
    add greatestid int as (greatest(userid, friendid)) 
    persistent after friendid;

Now we can add a unique constraint on the generated columns:

alter table friends add constraint ak1_friends 
    unique (leastid, greatestid);

insert into friends (userid, friendid) values (1,2);
ERROR 1062 (23000): Duplicate entry '1-2' for key 'ak1_friends'
insert into friends (userid, friendid) values (2,1);
ERROR 1062 (23000): Duplicate entry '1-2' for key 'ak1_friends'

To find common friends among for example 3 and 5 you can use a query like:

select f1.userid, f1.friendid, f2.userid, f2.friendid 
from friends f1 
join friends f2 
    on f1.userid in (f2.userid, f2.friendid) 
    or f1.friendid in (f2.userid, f2.friendid) 
where 5 in (f1.userid, f1.friendid) 
  and 3 in (f2.userid, f2.friendid);

Related Solutions

Mysql – How to select all online friends

Personally, I would have friends be a one-way association; this would double the length of your friends table but eliminate many places where you would need to duplicate logic depending on whether the current user was listed on the left or the right of each entry.

+------+------+
| uid1 | uid2 |
+------+------+
|    1 |    2 |
|    2 |    1 |
+------+------+

Given a choice between saving RAM/hard drive space, and simpler business logic, the choice should almost always fall to simpler rather than smaller. I suspect that most use-cases will have this performing better as well.

Mysql – SQL DB design to support user feeds (in application like facebook)

Get rid of ambiguous columns like object_id. Use log_type to determine what you are querying on. So something like:

CREATE TABLE logs (
    log_id bigint AUTO_INCREMENT, -- not sure if this is the mysql syntax
    user_id int references users(id),
    post_id bigint references posts(id)
);

A second point is I would suggest changing the friend/unfriend to be immediate so it doesn't hit this table. This should be only for logging actions to posts, not friendships and the like. If you must log and propagate, create a separate table for this.

Best Answer

Related Solutions

Mysql – How to select all online friends

Mysql – SQL DB design to support user feeds (in application like facebook)

Related Question