I created a random sample of 30 rows based on the table above as follows:
DROP DATABASE IF EXISTS sample;
CREATE DATABASE sample;
USE sample
CREATE TABLE users
(id int not null auto_increment,
users_tbl_points int not null,
users_tbl_rank int not null default 0,
primary key (id),key (users_tbl_points));
INSERT INTO users (users_tbl_points) VALUES
(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000));
SET @r = 0; UPDATE users SET users_tbl_rank = @r := (@r + 1) ORDER BY users_tbl_points DESC;
SELECT * FROM users ORDER BY users_tbl_points DESC;
Here is that output
mysql> DROP DATABASE IF EXISTS sample;
CREATE DATABASE sample;
USE sample
CREATE TABLE users
(id int not null auto_increment,
users_tbl_points int not null,
users_tbl_rank int not null default 0,
primary key (id),key (users_tbl_points));
INSERT INTO users (users_tbl_points) VALUES
(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000));
SET @r = 0; UPDATE users SET users_tbl_rank = @r := (@r + 1) ORDER BY users_tbl_points DESC;
SELECT * FROM users ORDER BY users_tbl_points DESC;
Query OK, 1 row affected (0.04 sec)
mysql> CREATE DATABASE sample;
Query OK, 1 row affected (0.00 sec)
mysql> USE sample
Database changed
mysql> CREATE TABLE users
-> (id int not null auto_increment,
-> users_tbl_points int not null,
-> users_tbl_rank int not null default 0,
-> primary key (id),key (users_tbl_points));
Query OK, 0 rows affected (0.06 sec)
mysql> INSERT INTO users (users_tbl_points) VALUES
-> (FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
-> (FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
-> (FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
-> (FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
-> (FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
-> (FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
-> (FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
-> (FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
-> (FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),
-> (FLOOR(rand()*1000000)),(FLOOR(rand()*1000000)),(FLOOR(rand()*1000000));
Query OK, 30 rows affected (0.08 sec)
Records: 30 Duplicates: 0 Warnings: 0
mysql> SET @r = 0; UPDATE users SET users_tbl_rank = @r := (@r + 1) ORDER BY users_tbl_points DESC;
Query OK, 0 rows affected (0.00 sec)
Query OK, 30 rows affected (0.05 sec)
Rows matched: 30 Changed: 30 Warnings: 0
mysql> SELECT * FROM users ORDER BY users_tbl_points DESC;
+----+------------------+----------------+
| id | users_tbl_points | users_tbl_rank |
+----+------------------+----------------+
| 27 | 974504 | 1 |
| 3 | 942712 | 2 |
| 26 | 914442 | 3 |
| 23 | 867887 | 4 |
| 13 | 828882 | 5 |
| 12 | 809094 | 6 |
| 29 | 722458 | 7 |
| 14 | 717128 | 8 |
| 4 | 702457 | 9 |
| 1 | 691932 | 10 |
| 5 | 684147 | 11 |
| 8 | 631802 | 12 |
| 9 | 615874 | 13 |
| 7 | 514378 | 14 |
| 17 | 420973 | 15 |
| 24 | 360579 | 16 |
| 16 | 343592 | 17 |
| 22 | 326285 | 18 |
| 20 | 315427 | 19 |
| 6 | 313363 | 20 |
| 21 | 254513 | 21 |
| 30 | 224710 | 22 |
| 25 | 199236 | 23 |
| 10 | 183963 | 24 |
| 28 | 129194 | 25 |
| 19 | 107540 | 26 |
| 15 | 98995 | 27 |
| 18 | 74092 | 28 |
| 11 | 72197 | 29 |
| 2 | 3702 | 30 |
+----+------------------+----------------+
30 rows in set (0.01 sec)
mysql>
Let me update the scores
mysql> UPDATE users SET users_tbl_points = FLOOR(rand()*1000000);
Query OK, 30 rows affected (0.08 sec)
Rows matched: 30 Changed: 30 Warnings: 0
Let me rerank them
mysql> SET @r = 0; UPDATE users SET users_tbl_rank = @r := (@r + 1) ORDER BY users_tbl_points DESC;
Query OK, 0 rows affected (0.00 sec)
Query OK, 29 rows affected (0.06 sec)
Rows matched: 30 Changed: 29 Warnings: 0
Let me display them:
mysql> SELECT * FROM users ORDER BY users_tbl_points DESC;
+----+------------------+----------------+
| id | users_tbl_points | users_tbl_rank |
+----+------------------+----------------+
| 28 | 994943 | 1 |
| 7 | 983596 | 2 |
| 4 | 964909 | 3 |
| 24 | 921117 | 4 |
| 14 | 885399 | 5 |
| 8 | 881325 | 6 |
| 3 | 858830 | 7 |
| 11 | 808514 | 8 |
| 1 | 785523 | 9 |
| 26 | 751842 | 10 |
| 22 | 749931 | 11 |
| 21 | 705158 | 12 |
| 25 | 703038 | 13 |
| 27 | 650093 | 14 |
| 15 | 649229 | 15 |
| 10 | 635204 | 16 |
| 23 | 634182 | 17 |
| 16 | 589948 | 18 |
| 9 | 455836 | 19 |
| 2 | 443080 | 20 |
| 6 | 345553 | 21 |
| 13 | 259255 | 22 |
| 20 | 258620 | 23 |
| 5 | 248056 | 24 |
| 18 | 240429 | 25 |
| 19 | 195981 | 26 |
| 30 | 137355 | 27 |
| 12 | 136960 | 28 |
| 29 | 24437 | 29 |
| 17 | 2055 | 30 |
+----+------------------+----------------+
30 rows in set (0.00 sec)
mysql>
Looks like it works to me !!!
CAVEAT
Make sure you have an index on the users_tbl_points
column
Here is your original query:
SELECT
user_actions.*,
users.twitter_username,
users.email
FROM
`user_actions`
INNER JOIN users
ON (user_actions.user_id=users.id)
ORDER BY timestamp
DESC LIMIT 0, 30
;
First thing I notice is that you are joining two whole tables. Since you only need twitter_username
and email
from the users
table, you should only join from users
using three columns : id
, twitter_username
and email
.
Second thing is the LIMIT
clause. It is being executed after the join. You should execute it before the join. In your case, you are requesting for the 30 most recent user actions. If you can guarantee that only 30 rows are retreived from user_actions
, the join should operate much faster.
If you read the answer from @DTest, his first two bulletpoints already tell you what's wrong the query because of the actions mysql will take in gathering data from each table. The key is to understand what the temp tables will look like while the query is being processed and where the data will reside (memory or disk).
What you need to do is refactor the query to fool the MySQL Query Optimizer. Force the query to produce smaller temp tables. In most cases, config changes in my.cnf should make a dramamtic difference. In other cases, such as this one, refactoring the query may be sufficient.
Here is my proposed change to your query that should work faster:
SELECT
ua.*,
u.twitter_username,
u.email
FROM
(SELECT * FROM `user_actions`
ORDER BY timestamp DESC LIMIT 30) ua
LEFT JOIN
(SELECT id,twitter_username,email FROM `users`) u
ON (ua.user_id=u.id)
;
Here are reasons for refactoring the query:
REASON #1
If you look at inline table ua
, I retrieve only 30 rows using LIMIT
. This will happen no matter how big the user_actions
table gets. It is already ordered because the ORDER BY timestamp DESC
happens before the LIMIT
.
REASON #2
If you look inline table u
, it has id
,twitter_username
,email
. The id
is needed to implement the join.
REASON #3
I use LEFT JOIN
instead of INNER JOIN
for two(2) reasons:
- Preserve the order of the query based on
ua
- Display all user actions in case the user_id in the
ua
no longer exists in the users
tables.
Doing these things will force the temp tables to be smaller. Nevertheless, you will still need implement bulletpoint #3 from @DTest's answer to preempt having temp tables land on disk.
Best Answer
You can self join to tables for this sort of criteria(at least, I can do this in SQL Server):
Note that we're joing rating back to itself. This way we can use the WHERE clause to find rows in
rating
that are greater in both date and stars to other rows inrating
.You can review the full example I built in SQL Fiddle here.
If you want to add the condition that the reviewer made exactly these 2 reviews of the movie and no more, you can add:
This assumes that the
Rating
table has(rID, mID, ratingDate)
as primary key.