Mysql – Friends relations in MySQL

MySQLrelational-theory

I'm developing a friendship relation in MySQL where friend relation is mutual. If A is friend of B, then B is friend of A. If one of the users end friendship then relation drops. I want to learn which way is better.

I have a running system;

user
-----------
userid p.k
name 

friends
-------
userid
friendid
primary key (`userid`,`friendid`),
key `friendid` (`friendid`)

1 2
2 5
1 3


To get all of my friends;
SELECT u.name, f.friendid , IF(f.userid = $userid, f.friendid, f.userid) friendid 
FROM friends f 
    inner join user u  ON ( u.userid = IF(f.userid = $userid, f.friendid, f.userid)) 
WHERE ( f.userid = '$userid' or f.friendid = '$userid' ) 

This query works well. Maybe I can add a UNION. The Query is more complicated than the one below and the table contains half as many records as the one below.

Another way is to keep relations in separate rows;

1 2
2 1
2 5
5 2
1 3
3 1

SELECT u.name, f.friendid 
FROM friends f inner join user u ON ( u.userid = f.friendid ) 
WHERE f.userid = '$userid'

This query is simple, although the table takes twice as much space.

My concern is; assuming that there are millions of users; which way will work faster?

What are the advantages and disadvantages of both ways?

What should I keep in mind or change for these ways? And what problems can I face for both ways?

Best Answer

The first thing that catches my eye is the index setup for friends.

You have this at the moment:

friends
-------
userid
friendid
primary key (`userid`,`friendid`),
key `friendid` (`friendid`)

When crosschecking for mutual friendship, it could incur a little expense because the userid may be retrieved from the table when traversing the friendid index. Perhaps you could index as follows:

friends
-------
userid
friendid
primary key (`userid`,`friendid`),
unique key `friendid` (`friendid`,`userid`)

This might remove any need to access the table and search the index only.

Now, in terms of the queries, both of them may improve with the new unique index. Creating the unique index also eliminates the need to insert (A,B) and (B,A) into the table because (A,B) and (B,A) would be the index anyway. Thus, the second query would not have to gore through the table to see if someone is a friend of someone else because another person initiated the friendship. That way, if the friendship is broken by just one person, there are no orphaned friendships that are one-sided (seems a lot like life these days, doesn't it?)

Your first query looks like it would benefit more from the unique index. Even with millions of rows, locating friends using the indexes only would avoid touching the table. Still, since you did not present a UNION query, I would like to recommend a UNION query:

SET @givenuserid = ?;
SELECT B.name "Friend's Name"
FROM 
(
    SELECT userid FROM friends WHERE friendid=@givenuserid
    UNION
    SELECT friendid FROM friends WHERE userid=@givenuserid
) A INNER JOIN user B USING (userid);

This will let you see who are the friends of each userid

To see all friendships, run this:

SELECT A.userid,A.name,B.friendid,C.name
FROM user A
INNER JOIN friends B ON A.userid=B.userid
INNER JOIN user C on B.friendid=C.userid;

First, here is some sample data:

mysql> drop database if exists key_ilyuk;
Query OK, 2 rows affected (0.01 sec)

mysql> create database key_ilyuk;
Query OK, 1 row affected (0.00 sec)

mysql> use key_ilyuk
Database changed
mysql> create table user
    -> (
    ->     userid INT NOT NULL AUTO_INCREMENT,
    ->     name varchar(20),
    ->     primary key(userid)
    -> ) ENGINE=MyISAM;
Query OK, 0 rows affected (0.04 sec)

mysql> insert into user (name) values
    -> ('rolando'),('pamela'),('dominique'),('carlik'),('diamond');
Query OK, 5 rows affected (0.01 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> create table friends
    -> (
    ->     userid INT NOT NULL,
    ->     friendid INT NOT NULL,
    ->     primary key (userid,friendid),
    ->     unique key (friendid,userid)
    -> ) ENGINE=MyISAM;
Query OK, 0 rows affected (0.03 sec)

mysql> insert into friends values (1,2),(2,5),(1,3);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> select * from user;
+--------+-----------+
| userid | name      |
+--------+-----------+
|      1 | rolando   |
|      2 | pamela    |
|      3 | dominique |
|      4 | carlik    |
|      5 | diamond   |
+--------+-----------+
5 rows in set (0.00 sec)

mysql> select * from friends;
+--------+----------+
| userid | friendid |
+--------+----------+
|      1 |        2 |
|      1 |        3 |
|      2 |        5 |
+--------+----------+
3 rows in set (0.00 sec)

mysql>

Let's look at all the relationships

mysql> SELECT A.userid,A.name,B.friendid,C.name
    -> FROM user A
    -> INNER JOIN friends B ON A.userid=B.userid
    -> INNER JOIN user C on B.friendid=C.userid
    -> ;
+--------+---------+----------+-----------+
| userid | name    | friendid | name      |
+--------+---------+----------+-----------+
|      1 | rolando |        2 | pamela    |
|      1 | rolando |        3 | dominique |
|      2 | pamela  |        5 | diamond   |
+--------+---------+----------+-----------+
3 rows in set (0.00 sec)

mysql>

Let's look at all 5 userid's and see if the relationships are correctly shown

mysql> SET @givenuserid = 1;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT B.name "Friend's Name"
    -> FROM
    -> (
    ->     SELECT userid FROM friends WHERE friendid=@givenuserid
    ->     UNION
    ->     SELECT friendid FROM friends WHERE userid=@givenuserid
    -> ) A INNER JOIN user B USING (userid);
+---------------+
| Friend's Name |
+---------------+
| pamela        |
| dominique     |
+---------------+
2 rows in set (0.00 sec)

mysql> SET @givenuserid = 2;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT B.name "Friend's Name"
    -> FROM
    -> (
    ->     SELECT userid FROM friends WHERE friendid=@givenuserid
    ->     UNION
    ->     SELECT friendid FROM friends WHERE userid=@givenuserid
    -> ) A INNER JOIN user B USING (userid);
+---------------+
| Friend's Name |
+---------------+
| rolando       |
| diamond       |
+---------------+
2 rows in set (0.00 sec)

mysql> SET @givenuserid = 3;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT B.name "Friend's Name"
    -> FROM
    -> (
    ->     SELECT userid FROM friends WHERE friendid=@givenuserid
    ->     UNION
    ->     SELECT friendid FROM friends WHERE userid=@givenuserid
    -> ) A INNER JOIN user B USING (userid);
+---------------+
| Friend's Name |
+---------------+
| rolando       |
+---------------+
1 row in set (0.01 sec)

mysql> SET @givenuserid = 4;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT B.name "Friend's Name"
    -> FROM
    -> (
    ->     SELECT userid FROM friends WHERE friendid=@givenuserid
    ->     UNION
    ->     SELECT friendid FROM friends WHERE userid=@givenuserid
    -> ) A INNER JOIN user B USING (userid);
Empty set (0.00 sec)

mysql> SET @givenuserid = 5;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT B.name "Friend's Name"
    -> FROM
    -> (
    ->     SELECT userid FROM friends WHERE friendid=@givenuserid
    ->     UNION
    ->     SELECT friendid FROM friends WHERE userid=@givenuserid
    -> ) A INNER JOIN user B USING (userid);
+---------------+
| Friend's Name |
+---------------+
| pamela        |
+---------------+
1 row in set (0.00 sec)

mysql>

They all look correct to me.

Now, let's use your second query to see if it matches up...

mysql> SET @givenuserid = 1;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT u.name, f.friendid
    -> FROM friends f inner join user u ON ( u.userid = f.friendid )
    -> WHERE f.userid = @givenuserid;
+-----------+----------+
| name      | friendid |
+-----------+----------+
| pamela    |        2 |
| dominique |        3 |
+-----------+----------+
2 rows in set (0.00 sec)

mysql> SET @givenuserid = 2;
Query OK, 0 rows affected (0.01 sec)

mysql> SELECT u.name, f.friendid
    -> FROM friends f inner join user u ON ( u.userid = f.friendid )
    -> WHERE f.userid = @givenuserid;
+---------+----------+
| name    | friendid |
+---------+----------+
| diamond |        5 |
+---------+----------+
1 row in set (0.00 sec)

mysql> SET @givenuserid = 3;
Query OK, 0 rows affected (0.01 sec)

mysql> SELECT u.name, f.friendid
    -> FROM friends f inner join user u ON ( u.userid = f.friendid )
    -> WHERE f.userid = @givenuserid;
Empty set (0.00 sec)

mysql> SET @givenuserid = 4;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT u.name, f.friendid
    -> FROM friends f inner join user u ON ( u.userid = f.friendid )
    -> WHERE f.userid = @givenuserid;
Empty set (0.00 sec)

mysql> SET @givenuserid = 5;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT u.name, f.friendid
    -> FROM friends f inner join user u ON ( u.userid = f.friendid )
    -> WHERE f.userid = @givenuserid;
Empty set (0.00 sec)

mysql>

Why don't match up? That's because I did not load the (B,A) for every (A,B). Let me load the (B,A) relationships and try your second query again.

mysql> insert into friends values (2,1),(5,2),(3,1);
Query OK, 3 rows affected (0.02 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> SET @givenuserid = 1;
Query OK, 0 rows affected (0.01 sec)

mysql> SELECT u.name, f.friendid
    -> FROM friends f inner join user u ON ( u.userid = f.friendid )
    -> WHERE f.userid = @givenuserid;
+-----------+----------+
| name      | friendid |
+-----------+----------+
| pamela    |        2 |
| dominique |        3 |
+-----------+----------+
2 rows in set (0.00 sec)

mysql> SET @givenuserid = 2;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT u.name, f.friendid
    -> FROM friends f inner join user u ON ( u.userid = f.friendid )
    -> WHERE f.userid = @givenuserid;
+---------+----------+
| name    | friendid |
+---------+----------+
| rolando |        1 |
| diamond |        5 |
+---------+----------+
2 rows in set (0.00 sec)

mysql> SET @givenuserid = 3;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT u.name, f.friendid
    -> FROM friends f inner join user u ON ( u.userid = f.friendid )
    -> WHERE f.userid = @givenuserid;
+---------+----------+
| name    | friendid |
+---------+----------+
| rolando |        1 |
+---------+----------+
1 row in set (0.00 sec)

mysql> SET @givenuserid = 4;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT u.name, f.friendid
    -> FROM friends f inner join user u ON ( u.userid = f.friendid )
    -> WHERE f.userid = @givenuserid;
Empty set (0.00 sec)

mysql> SET @givenuserid = 5;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT u.name, f.friendid
    -> FROM friends f inner join user u ON ( u.userid = f.friendid )
    -> WHERE f.userid = @givenuserid;
+--------+----------+
| name   | friendid |
+--------+----------+
| pamela |        2 |
+--------+----------+
1 row in set (0.00 sec)

mysql>

They still don't match. That's because your second query is only checking one side.

Let's check your first query against every value with only (A,B) and not (B,A):

mysql> SET @givenuserid = 1;
SELECT u.name, f.friendid userid, IF(f.userid = @givenuserid, f.friendid, f.userid) friendid
FROM friends f
    inner join user u  ON ( u.userid = IF(f.userid = @givenuserid, f.friendid, f.userid))
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT u.name, f.friendid userid, IF(f.userid = @givenuserid, f.friendid, f.userid) friendid
    -> FROM friends f
    ->     inner join user u  ON ( u.userid = IF(f.userid = @givenuserid, f.friendid, f.userid))
    -> WHERE ( f.userid = @givenuserid or f.friendid = @givenuserid  );
+-----------+--------+----------+
| name      | userid | friendid |
+-----------+--------+----------+
| pamela    |      2 |        2 |
| dominique |      3 |        3 |
+-----------+--------+----------+
2 rows in set (0.00 sec)

mysql> SET @givenuserid = 2;
FROM friends f
    inner join user u  ON ( u.userid = IF(f.userid = @givenuserid, f.friendid, f.userid))
WHERE ( f.userid = @givenuserid or f.friendid = @givenuserid  );
Query OK, 0 rows affected (0.01 sec)

mysql> SELECT u.name, f.friendid userid, IF(f.userid = @givenuserid, f.friendid, f.userid) friendid
    -> FROM friends f
    ->     inner join user u  ON ( u.userid = IF(f.userid = @givenuserid, f.friendid, f.userid))
    -> WHERE ( f.userid = @givenuserid or f.friendid = @givenuserid  );
+---------+--------+----------+
| name    | userid | friendid |
+---------+--------+----------+
| rolando |      2 |        1 |
| diamond |      5 |        5 |
+---------+--------+----------+
2 rows in set (0.00 sec)

mysql> SET @givenuserid = 3;
SELECT u.name, f.friendid userid, IF(f.userid = @givenuserid, f.friendid, f.userid) friendid
FROM friends f
    inner join user u  ON ( u.userid = IF(f.userid = @givenuserid, f.friendid, f.userid))
WHERE ( f.userid = @givenuserid or f.friendid = @givenuserid  );
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT u.name, f.friendid userid, IF(f.userid = @givenuserid, f.friendid, f.userid) friendid
    -> FROM friends f
    ->     inner join user u  ON ( u.userid = IF(f.userid = @givenuserid, f.friendid, f.userid))
    -> WHERE ( f.userid = @givenuserid or f.friendid = @givenuserid  );
+---------+--------+----------+
| name    | userid | friendid |
+---------+--------+----------+
| rolando |      3 |        1 |
+---------+--------+----------+
1 row in set (0.00 sec)

mysql> SET @givenuserid = 4;
FROM friends f
    inner join user u  ON ( u.userid = IF(f.userid = @givenuserid, f.friendid, f.userid))
WHERE ( f.userid = @givenuserid or f.friendid = @givenuserid  );
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT u.name, f.friendid userid, IF(f.userid = @givenuserid, f.friendid, f.userid) friendid
    -> FROM friends f
    ->     inner join user u  ON ( u.userid = IF(f.userid = @givenuserid, f.friendid, f.userid))
    -> WHERE ( f.userid = @givenuserid or f.friendid = @givenuserid  );
Empty set (0.01 sec)

mysql> SET @givenuserid = 5;
FROM friends f
Query OK, 0 rows affected (0.00 sec)

    inner join user u  ON ( u.userid = IF(f.userid = @givenuserid, f.friendid, f.userid))
mysql> SELECT u.name, f.friendid userid, IF(f.userid = @givenuserid, f.friendid, f.userid) friendid
    -> FROM friends f
    ->     inner join user u  ON ( u.userid = IF(f.userid = @givenuserid, f.friendid, f.userid))
    -> WHERE ( f.userid = @givenuserid or f.friendid = @givenuserid  );
+--------+--------+----------+
| name   | userid | friendid |
+--------+--------+----------+
| pamela |      5 |        2 |
+--------+--------+----------+
1 row in set (0.00 sec)

mysql>

Your first works well. I am sure it is benefiting from the unique index as I said earlier, but IMHO I think the UNION is simpler. With than unique index, it would appear to be six of one and half dozen of the other in terms of execution and output.

You would have to benchmark your first query against my suggestion UNION and see.

This was a good question you asked today. +1 for your question.