I had the following table that holds phone contacts of users.
CREATE TABLE `contacts_phone` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`playerPhone` varchar(45) NOT NULL,
`friendPhone` varchar(45) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `by_contact` (`playerPhone`,`friendPhone`),
KEY `by_id` (`playerPhone`),
KEY `by_phone` (`friendPhone`)
) ENGINE=InnoDB AUTO_INCREMENT=0 DEFAULT CHARSET=utf8;
I built the next query that returns mutual friends of my friends:
SELECT SQL_NO_CACHE phones.*, COUNT(phones.friendPhone) AS mutual_friends
FROM (SELECT cp2.friendPhone
FROM contacts_phone cp1
JOIN contacts_phone cp2 ON cp1.friendPhone= cp2.playerPhone
WHERE cp1.playerPhone = 'PHONE_NUMBER') phones
WHERE phones.friendPhone IN (SELECT friendPhone
FROM contacts_phone
WHERE playerPhone = 'PHONE_NUMBER')
GROUP BY phones.friendPhone
Until this point everything was ok 🙂
As our product grew we have collected 13M rows to this table and had a product requirement to add the users id to this table. So I made the next change:
ALTER TABLE `contacts_phone`
ADD COLUMN `cid` VARCHAR(12) NULL DEFAULT NULL AFTER `id`,
DROP INDEX `by_contact` ,
ADD UNIQUE INDEX `by_contact` (`playerPhone` ASC, `friendPhone` ASC, `cid` ASC),
ADD INDEX `by_cid` (`cid` ASC);
I added the user's id (cid) to the table and filled the missing data. part of the rows didn't have id so I had to use DEFAULT NULL
.
I built a different but pretty similar query but now using cid instead.
SELECT SQL_NO_CACHE phones.*, COUNT(phones.friendPhone) AS mutual_friends
FROM (SELECT cp2.friendPhone
FROM contacts_phone cp1
JOIN contacts_phone cp2 ON cp1.friendPhone = cp2.playerPhone
WHERE cp1.cid = 'USER_ID') phones
WHERE phones.friendPhone IN (SELECT friendPhone
FROM contacts_phone
WHERE cid = 'USER_ID')
GROUP BY phones.friendPhone
I expected to get a similar query duration but the result was that when querying with cid the duration is doubled.
What am I missing out? Do I have something wrong in my design?
Could it be because of the nullable index?
Best Answer
Assuming
UNIQUE(personPhone, friendPhone)
. (UNIQUE
is used to prevent redundant entries.) (in tablecontacts_phone
)Then the number of friends that 'Rami' and 'Rick' have in common is found
And the
UNIQUE
index above serves this query well.Another approach:
This finds all our common friends. Wrap a
SELECT COUNT(*) ( ... ) AS y
to get the count. (This is likely to be slower.)How many different friends Rami's friends have. First, let's decide on some details using this example:
Should the count be:
The desired output is:
Friend of friend
Let's break the problem into 2 steps. First, see if this gives you the list of friend-of-friends, together with their counts:
That should include people that are not directly your friends. So, let's filter those out. One way is thus:
(Other approaches might involve
HAVING
orEXISTS
.)