OK, first off, I think you are almost there. Looking at the schema and data, one thing I noticed is that you have overlooked the fact that a friendship is bi-directional: when you create a friend entry from a request, you also need to create one in the other direction as well:
INSERT INTO `default_friend` (`friend_id`, `user_id`, `is_suscriber`, `privacy`, `created_at`, `friend_list_id`, `approved`)
VALUES (1, 2, 1, 0, '2012-08-13 18:16:11', 0, 1);
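A minimal sketch of that idea, wrapping both inserts in one transaction so the pair is never left half-created (SQLite stands in for your actual engine here, and the table is trimmed to the columns that matter; column names follow the schema in the question):

```python
import sqlite3

# Trimmed-down default_friend table, in-memory, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE default_friend (
        friend_id INTEGER,
        user_id   INTEGER,
        approved  INTEGER
    )
""")

def add_friendship(conn, user_a, user_b):
    """Insert BOTH directions inside one transaction, so a failure
    on the second insert rolls back the first as well."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO default_friend (friend_id, user_id, approved) VALUES (?, ?, 1)",
            (user_b, user_a))
        conn.execute(
            "INSERT INTO default_friend (friend_id, user_id, approved) VALUES (?, ?, 1)",
            (user_a, user_b))

add_friendship(conn, 1, 2)
rows = conn.execute(
    "SELECT user_id, friend_id FROM default_friend ORDER BY user_id").fetchall()
print(rows)  # → [(1, 2), (2, 1)]
```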
After you have done that, your query should return something much closer to the result you are after. Running this query:
select distinct u.id as `user_id`, u.username, f.id as `friend_id`, f.username as friend, s.*
from default_users as u
left join default_friend as df on df.user_id = u.id
left join default_users as f on f.id = df.friend_id
left join default_status as s on s.user_id = u.id
left join default_comment as c on c.status_id = s.status_id
order by s.status_id;
returns the following result set:
user_id | username | friend_id | friend | status_id | message | created_at | privacy | user_id | is_reply | device
1 | admin | 2 | demo | 1 | dasdasdasdasdasd | 2012-08-13 19:45:37 | NULL | 1 | 0 |
2 | demo | 1 | admin | 2 | dasdasdasdasdasd | 2012-08-13 19:46:03 | NULL | 2 | 0 |
1 | admin | 2 | demo | 3 | dasdsad344hbvnbnhjhgjhjghjhj | 2012-08-13 21:54:53 | NULL | 1 | 0 |
Is this anywhere near what you are looking for?
You can add a new column to replies, call it position, and fill it with consecutive numbers of replies per thread (the position of the reply in the thread).
For example
id | thread_id | text | position
1 | 1 | .... | 1
2 | 2 | .... | 1
3 | 1 | .... | 2
4 | 1 | .... | 3
5 | 2 | .... | 2
6 | 3 | .... | 1
Further, put an index on (thread_id, position, id), which allows you to write queries like
SELECT * FROM replies
WHERE thread_id = 1234
AND position BETWEEN 125400 AND 125410
ORDER BY id ASC
which runs fast, since this does not need a full index scan.
You can either update this column in your application, or write a database trigger to do this automatically.
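As a sketch of the trigger route, here is one way to keep position filled automatically. SQLite is used purely for illustration (trigger syntax differs on MySQL and other engines), and the table/trigger names are my own; note that the inserts below reproduce the example table above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE replies (
        id        INTEGER PRIMARY KEY,
        thread_id INTEGER NOT NULL,
        text      TEXT,
        position  INTEGER
    );
    CREATE INDEX idx_replies_pos ON replies (thread_id, position, id);

    -- After each insert, number the new reply within its thread:
    -- its position is simply how many replies the thread now has.
    CREATE TRIGGER replies_set_position AFTER INSERT ON replies
    BEGIN
        UPDATE replies
        SET position = (SELECT COUNT(*) FROM replies
                        WHERE thread_id = NEW.thread_id)
        WHERE id = NEW.id;
    END;
""")

# Same insert order as the example table above:
for thread_id in (1, 2, 1, 1, 2, 3):
    conn.execute("INSERT INTO replies (thread_id, text) VALUES (?, '....')",
                 (thread_id,))

rows = conn.execute(
    "SELECT id, thread_id, position FROM replies ORDER BY id").fetchall()
print(rows)
# → [(1, 1, 1), (2, 2, 1), (3, 1, 2), (4, 1, 3), (5, 2, 2), (6, 3, 1)]
```

On a table with concurrent writers you would need to be careful that two simultaneous inserts into the same thread cannot compute the same count; that is part of the initial effort mentioned below.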
The initial effort is quite high, I admit. We used this trick a few years ago on a quite large table with a high write frequency, and as I said it took some effort to get running, but once the solution was in place, the performance gain was overwhelming.
You only need 2 tables:
Users (user_id (PK), username, password, email, created_at)
Posts (post_id (PK), user_id (IX), subject, body, created_at)
Updates to either table would be done using the PK so they will be very fast. A simple JOIN can be used to retrieve posts of a particular user based on username, or a simple SELECT if the user_id is already cached.
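A minimal sketch of that two-table design and the username join, again with SQLite standing in for whatever engine you actually use (column names follow the answer; the sample data is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        user_id    INTEGER PRIMARY KEY,
        username   TEXT UNIQUE NOT NULL,
        password   TEXT NOT NULL,
        email      TEXT,
        created_at TEXT
    );
    CREATE TABLE posts (
        post_id    INTEGER PRIMARY KEY,
        user_id    INTEGER NOT NULL REFERENCES users (user_id),
        subject    TEXT,
        body       TEXT,
        created_at TEXT
    );
    CREATE INDEX idx_posts_user ON posts (user_id);  -- the (IX) above
""")
conn.execute("INSERT INTO users (username, password) VALUES ('alice', 'x')")
conn.execute("INSERT INTO posts (user_id, subject) VALUES (1, 'hello')")

# Posts of a particular user, looked up by username:
rows = conn.execute("""
    SELECT p.post_id, p.subject
    FROM posts AS p
    JOIN users AS u ON u.user_id = p.user_id
    WHERE u.username = ?
""", ("alice",)).fetchall()
print(rows)  # → [(1, 'hello')]
```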
Later on, any new feature becomes another table referencing data in the tables above. Say you wanted to add a feature that tracked 'Post Likes'. Create a table
'post_likes' (post_id (PK), user_id (PK), created_at)
. Each time a user likes someone else's post, insert a row into it. Or say you wanted to allow people to 'subscribe' to a particular user:
'user_subscribe' (subscribee_id (PK), subscriber_id (PK))
. When user #14 makes a new post, email all of those people about the new post.
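Both follow-on tables can be sketched the same way; the composite primary keys enforce "one like per user per post" and "one subscription per pair", and the notification step is a single lookup (SQLite for illustration, sample IDs invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post_likes (
        post_id    INTEGER,
        user_id    INTEGER,
        created_at TEXT,
        PRIMARY KEY (post_id, user_id)      -- one like per user per post
    );
    CREATE TABLE user_subscribe (
        subscribee_id INTEGER,
        subscriber_id INTEGER,
        PRIMARY KEY (subscribee_id, subscriber_id)
    );
""")

# Users 7 and 9 subscribe to user 14:
conn.executemany(
    "INSERT INTO user_subscribe (subscribee_id, subscriber_id) VALUES (?, ?)",
    [(14, 7), (14, 9)])

# When user 14 makes a new post, find everyone to email:
subscribers = [r[0] for r in conn.execute(
    "SELECT subscriber_id FROM user_subscribe WHERE subscribee_id = ? "
    "ORDER BY subscriber_id", (14,))]
print(subscribers)  # → [7, 9]
```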
Until your "posts" table reaches the ~500-700GB range, you should be fine with this. After that, we can discuss proper sharding.