The first thing that catches my eye is the index setup for friends.
You have this at the moment:
friends
-------
userid
friendid
primary key (`userid`,`friendid`),
key `friendid` (`friendid`)
When cross-checking for mutual friendship, this could incur a little expense because the userid may have to be retrieved from the table when traversing the friendid index. Perhaps you could index as follows:
friends
-------
userid
friendid
primary key (`userid`,`friendid`),
unique key `friendid` (`friendid`,`userid`)
This might remove any need to access the table, so that only the index is searched.
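One way to check whether the new index really covers the lookup is to swap the index and look at the EXPLAIN output. This is only a sketch: it assumes the existing secondary index is named friendid, as in your current definition, and the user id 1 is just an illustrative value.

```sql
-- Assumes the existing secondary index is named `friendid`.
ALTER TABLE friends
    DROP INDEX friendid,
    ADD UNIQUE KEY friendid (friendid, userid);

-- Reverse lookup: who lists user 1 as a friend? (1 is an illustrative value)
EXPLAIN SELECT userid FROM friends WHERE friendid = 1;
```

If the Extra column of the EXPLAIN output reports Using index, the query was answered from the index alone without reading the table rows. The plan you actually get may depend on your MySQL version and the size of the table.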
Now, in terms of the queries, both of them may improve with the new unique index. Creating the unique index also eliminates the need to insert both (A,B) and (B,A) into the table, because a single (A,B) row can be found from either side through the indexes anyway. Thus, the second query would not have to go through the table to see if someone is a friend of someone else because another person initiated the friendship. That way, if the friendship is broken by just one person, there are no orphaned friendships that are one-sided (seems a lot like life these days, doesn't it?).
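With one row per friendship, both the existence check and the break-up have to consider either ordering of the pair. A minimal sketch, where @a and @b are hypothetical session variables holding the two user ids:

```sql
-- @a and @b are hypothetical session variables, not part of your schema.
SET @a = 1, @b = 2;

-- Are they friends? Matches the row whichever user initiated the friendship.
SELECT COUNT(*) AS are_friends
FROM friends
WHERE (userid = @a AND friendid = @b)
   OR (userid = @b AND friendid = @a);

-- Break the friendship from either side with a single statement.
DELETE FROM friends
WHERE (userid = @a AND friendid = @b)
   OR (userid = @b AND friendid = @a);
```

Keep in mind that the two keys by themselves do not stop someone from inserting both (1,2) and (2,1), since each key treats those as different values. If you store one row per friendship, that has to be enforced at insert time.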
Your first query looks like it would benefit more from the unique index. Even with millions of rows, locating friends using the indexes only would avoid touching the table. Still, since you did not present a UNION query, I would like to recommend a UNION query:
SET @givenuserid = ?;
SELECT B.name "Friend's Name"
FROM
(
SELECT userid FROM friends WHERE friendid=@givenuserid
UNION
SELECT friendid FROM friends WHERE userid=@givenuserid
) A INNER JOIN user B USING (userid);
This will let you see who the friends of each userid are.
To see all friendships, run this:
SELECT A.userid,A.name,B.friendid,C.name
FROM user A
INNER JOIN friends B ON A.userid=B.userid
INNER JOIN user C on B.friendid=C.userid;
First, here is some sample data:
mysql> drop database if exists key_ilyuk;
Query OK, 2 rows affected (0.01 sec)
mysql> create database key_ilyuk;
Query OK, 1 row affected (0.00 sec)
mysql> use key_ilyuk
Database changed
mysql> create table user
-> (
-> userid INT NOT NULL AUTO_INCREMENT,
-> name varchar(20),
-> primary key(userid)
-> ) ENGINE=MyISAM;
Query OK, 0 rows affected (0.04 sec)
mysql> insert into user (name) values
-> ('rolando'),('pamela'),('dominique'),('carlik'),('diamond');
Query OK, 5 rows affected (0.01 sec)
Records: 5 Duplicates: 0 Warnings: 0
mysql> create table friends
-> (
-> userid INT NOT NULL,
-> friendid INT NOT NULL,
-> primary key (userid,friendid),
-> unique key (friendid,userid)
-> ) ENGINE=MyISAM;
Query OK, 0 rows affected (0.03 sec)
mysql> insert into friends values (1,2),(2,5),(1,3);
Query OK, 3 rows affected (0.00 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> select * from user;
+--------+-----------+
| userid | name |
+--------+-----------+
| 1 | rolando |
| 2 | pamela |
| 3 | dominique |
| 4 | carlik |
| 5 | diamond |
+--------+-----------+
5 rows in set (0.00 sec)
mysql> select * from friends;
+--------+----------+
| userid | friendid |
+--------+----------+
| 1 | 2 |
| 1 | 3 |
| 2 | 5 |
+--------+----------+
3 rows in set (0.00 sec)
mysql>
Let's look at all the relationships
mysql> SELECT A.userid,A.name,B.friendid,C.name
-> FROM user A
-> INNER JOIN friends B ON A.userid=B.userid
-> INNER JOIN user C on B.friendid=C.userid
-> ;
+--------+---------+----------+-----------+
| userid | name | friendid | name |
+--------+---------+----------+-----------+
| 1 | rolando | 2 | pamela |
| 1 | rolando | 3 | dominique |
| 2 | pamela | 5 | diamond |
+--------+---------+----------+-----------+
3 rows in set (0.00 sec)
mysql>
Let's look at all 5 userids and see if the relationships are correctly shown:
mysql> SET @givenuserid = 1;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT B.name "Friend's Name"
-> FROM
-> (
-> SELECT userid FROM friends WHERE friendid=@givenuserid
-> UNION
-> SELECT friendid FROM friends WHERE userid=@givenuserid
-> ) A INNER JOIN user B USING (userid);
+---------------+
| Friend's Name |
+---------------+
| pamela |
| dominique |
+---------------+
2 rows in set (0.00 sec)
mysql> SET @givenuserid = 2;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT B.name "Friend's Name"
-> FROM
-> (
-> SELECT userid FROM friends WHERE friendid=@givenuserid
-> UNION
-> SELECT friendid FROM friends WHERE userid=@givenuserid
-> ) A INNER JOIN user B USING (userid);
+---------------+
| Friend's Name |
+---------------+
| rolando |
| diamond |
+---------------+
2 rows in set (0.00 sec)
mysql> SET @givenuserid = 3;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT B.name "Friend's Name"
-> FROM
-> (
-> SELECT userid FROM friends WHERE friendid=@givenuserid
-> UNION
-> SELECT friendid FROM friends WHERE userid=@givenuserid
-> ) A INNER JOIN user B USING (userid);
+---------------+
| Friend's Name |
+---------------+
| rolando |
+---------------+
1 row in set (0.01 sec)
mysql> SET @givenuserid = 4;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT B.name "Friend's Name"
-> FROM
-> (
-> SELECT userid FROM friends WHERE friendid=@givenuserid
-> UNION
-> SELECT friendid FROM friends WHERE userid=@givenuserid
-> ) A INNER JOIN user B USING (userid);
Empty set (0.00 sec)
mysql> SET @givenuserid = 5;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT B.name "Friend's Name"
-> FROM
-> (
-> SELECT userid FROM friends WHERE friendid=@givenuserid
-> UNION
-> SELECT friendid FROM friends WHERE userid=@givenuserid
-> ) A INNER JOIN user B USING (userid);
+---------------+
| Friend's Name |
+---------------+
| pamela |
+---------------+
1 row in set (0.00 sec)
mysql>
They all look correct to me.
Now, let's use your second query to see if it matches up...
mysql> SET @givenuserid = 1;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT u.name, f.friendid
-> FROM friends f inner join user u ON ( u.userid = f.friendid )
-> WHERE f.userid = @givenuserid;
+-----------+----------+
| name | friendid |
+-----------+----------+
| pamela | 2 |
| dominique | 3 |
+-----------+----------+
2 rows in set (0.00 sec)
mysql> SET @givenuserid = 2;
Query OK, 0 rows affected (0.01 sec)
mysql> SELECT u.name, f.friendid
-> FROM friends f inner join user u ON ( u.userid = f.friendid )
-> WHERE f.userid = @givenuserid;
+---------+----------+
| name | friendid |
+---------+----------+
| diamond | 5 |
+---------+----------+
1 row in set (0.00 sec)
mysql> SET @givenuserid = 3;
Query OK, 0 rows affected (0.01 sec)
mysql> SELECT u.name, f.friendid
-> FROM friends f inner join user u ON ( u.userid = f.friendid )
-> WHERE f.userid = @givenuserid;
Empty set (0.00 sec)
mysql> SET @givenuserid = 4;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT u.name, f.friendid
-> FROM friends f inner join user u ON ( u.userid = f.friendid )
-> WHERE f.userid = @givenuserid;
Empty set (0.00 sec)
mysql> SET @givenuserid = 5;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT u.name, f.friendid
-> FROM friends f inner join user u ON ( u.userid = f.friendid )
-> WHERE f.userid = @givenuserid;
Empty set (0.00 sec)
mysql>
Why don't they match up? That's because I did not load the (B,A) for every (A,B). Let me load the (B,A) relationships and try your second query again.
mysql> insert into friends values (2,1),(5,2),(3,1);
Query OK, 3 rows affected (0.02 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> SET @givenuserid = 1;
Query OK, 0 rows affected (0.01 sec)
mysql> SELECT u.name, f.friendid
-> FROM friends f inner join user u ON ( u.userid = f.friendid )
-> WHERE f.userid = @givenuserid;
+-----------+----------+
| name | friendid |
+-----------+----------+
| pamela | 2 |
| dominique | 3 |
+-----------+----------+
2 rows in set (0.00 sec)
mysql> SET @givenuserid = 2;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT u.name, f.friendid
-> FROM friends f inner join user u ON ( u.userid = f.friendid )
-> WHERE f.userid = @givenuserid;
+---------+----------+
| name | friendid |
+---------+----------+
| rolando | 1 |
| diamond | 5 |
+---------+----------+
2 rows in set (0.00 sec)
mysql> SET @givenuserid = 3;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT u.name, f.friendid
-> FROM friends f inner join user u ON ( u.userid = f.friendid )
-> WHERE f.userid = @givenuserid;
+---------+----------+
| name | friendid |
+---------+----------+
| rolando | 1 |
+---------+----------+
1 row in set (0.00 sec)
mysql> SET @givenuserid = 4;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT u.name, f.friendid
-> FROM friends f inner join user u ON ( u.userid = f.friendid )
-> WHERE f.userid = @givenuserid;
Empty set (0.00 sec)
mysql> SET @givenuserid = 5;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT u.name, f.friendid
-> FROM friends f inner join user u ON ( u.userid = f.friendid )
-> WHERE f.userid = @givenuserid;
+--------+----------+
| name | friendid |
+--------+----------+
| pamela | 2 |
+--------+----------+
1 row in set (0.00 sec)
mysql>
This time the names match the UNION results, but only because both (A,B) and (B,A) are now stored. That's because your second query is only checking one side.
Let's check your first query against every value with only (A,B) and not (B,A):
mysql> SET @givenuserid = 1;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT u.name, f.friendid userid, IF(f.userid = @givenuserid, f.friendid, f.userid) friendid
-> FROM friends f
-> inner join user u ON ( u.userid = IF(f.userid = @givenuserid, f.friendid, f.userid))
-> WHERE ( f.userid = @givenuserid or f.friendid = @givenuserid );
+-----------+--------+----------+
| name | userid | friendid |
+-----------+--------+----------+
| pamela | 2 | 2 |
| dominique | 3 | 3 |
+-----------+--------+----------+
2 rows in set (0.00 sec)
mysql> SET @givenuserid = 2;
Query OK, 0 rows affected (0.01 sec)
mysql> SELECT u.name, f.friendid userid, IF(f.userid = @givenuserid, f.friendid, f.userid) friendid
-> FROM friends f
-> inner join user u ON ( u.userid = IF(f.userid = @givenuserid, f.friendid, f.userid))
-> WHERE ( f.userid = @givenuserid or f.friendid = @givenuserid );
+---------+--------+----------+
| name | userid | friendid |
+---------+--------+----------+
| rolando | 2 | 1 |
| diamond | 5 | 5 |
+---------+--------+----------+
2 rows in set (0.00 sec)
mysql> SET @givenuserid = 3;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT u.name, f.friendid userid, IF(f.userid = @givenuserid, f.friendid, f.userid) friendid
-> FROM friends f
-> inner join user u ON ( u.userid = IF(f.userid = @givenuserid, f.friendid, f.userid))
-> WHERE ( f.userid = @givenuserid or f.friendid = @givenuserid );
+---------+--------+----------+
| name | userid | friendid |
+---------+--------+----------+
| rolando | 3 | 1 |
+---------+--------+----------+
1 row in set (0.00 sec)
mysql> SET @givenuserid = 4;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT u.name, f.friendid userid, IF(f.userid = @givenuserid, f.friendid, f.userid) friendid
-> FROM friends f
-> inner join user u ON ( u.userid = IF(f.userid = @givenuserid, f.friendid, f.userid))
-> WHERE ( f.userid = @givenuserid or f.friendid = @givenuserid );
Empty set (0.01 sec)
mysql> SET @givenuserid = 5;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT u.name, f.friendid userid, IF(f.userid = @givenuserid, f.friendid, f.userid) friendid
-> FROM friends f
-> inner join user u ON ( u.userid = IF(f.userid = @givenuserid, f.friendid, f.userid))
-> WHERE ( f.userid = @givenuserid or f.friendid = @givenuserid );
+--------+--------+----------+
| name | userid | friendid |
+--------+--------+----------+
| pamela | 5 | 2 |
+--------+--------+----------+
1 row in set (0.00 sec)
mysql>
Your first query works well. I am sure it is benefiting from the unique index as I said earlier, but IMHO the UNION is simpler. With that unique index, it would appear to be six of one and half a dozen of the other in terms of execution and output.
You would have to benchmark your first query against my suggested UNION query and see.
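Before timing anything, a reasonable first step is to compare the plans of the two queries. This is just a sketch, using an illustrative value of 1 for @givenuserid:

```sql
SET @givenuserid = 1;

-- Plan for your first query (OR condition plus IF() in the join)
EXPLAIN
SELECT u.name, f.friendid userid, IF(f.userid = @givenuserid, f.friendid, f.userid) friendid
FROM friends f
INNER JOIN user u ON (u.userid = IF(f.userid = @givenuserid, f.friendid, f.userid))
WHERE (f.userid = @givenuserid OR f.friendid = @givenuserid);

-- Plan for the UNION query
EXPLAIN
SELECT B.name
FROM
(
    SELECT userid FROM friends WHERE friendid = @givenuserid
    UNION
    SELECT friendid FROM friends WHERE userid = @givenuserid
) A INNER JOIN user B USING (userid);
```

Compare the key, rows and Extra columns of the two plans, then time both queries against a realistically sized copy of your data. Five users and three friendships will not tell you much.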
This was a good question you asked today. +1 for your question.
Best Answer
I would store AB and BA. A friendship is really a two-way relationship: each entity is linked to another. Even though intuitively we think of the "friendship" as one link between two people, from a relational point of view it is more like "A has a friend B" and "B has a friend A". Two relationships, two records.
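If you go this route, both rows should be written and removed together so the two directions never drift apart. A sketch, assuming a friends (userid, friendid) table and a user (userid, name) table on a transactional engine such as InnoDB, with @a and @b as hypothetical session variables:

```sql
-- @a and @b are hypothetical session variables holding the two user ids.
SET @a = 1, @b = 4;

-- Create the friendship: both directions in one transaction.
START TRANSACTION;
INSERT INTO friends (userid, friendid) VALUES (@a, @b), (@b, @a);
COMMIT;

-- Listing friends then only needs the one-sided lookup.
SELECT u.name, f.friendid
FROM friends f
INNER JOIN user u ON u.userid = f.friendid
WHERE f.userid = @a;

-- End the friendship: remove both directions together.
START TRANSACTION;
DELETE FROM friends
WHERE (userid = @a AND friendid = @b)
   OR (userid = @b AND friendid = @a);
COMMIT;
```

The trade-off is twice as many rows in exchange for a simpler read query. On a non-transactional engine such as MyISAM the START TRANSACTION and COMMIT statements have no protective effect, so a failure between statements could leave a one-sided friendship behind.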