MySQL – Group by Multiple Columns on Multiple Tables with One Result per Group

group byMySQLselect

I have three different tables and I need to count how many rows has a specific value (COUNT(track)) then group by tracks.id and sort by only one result per user (tracks.uid).

My three tables:

`users`
+--------+-------------------+
| `idu`  | `username`        | 
+--------+-------------------+
| 567    | 'TrainingPuppy'   | 
| 687    | 'BathroomMakeover'|
| 45     | 'PoshNachos'      | 
| 15     | 'SewingButtons'   | 
+--------+-------------------+

`views`
+--------+------+---------+
| `id`   | `by` | `track` |
+--------+------+---------+
| 1      | 5    | 55      | 
| 2      | 5    | 55      | 
| 3      | 67   | 55      | 
| 4      | 6    | 29      | 
| 5      | 125  | 2       | 
| 6      | 5    | 698     | 
| 7      | 5    | 698     | 
+--------+------+---------+

`tracks`
+--------+-------+-----------------------+---------------------+
| `id`   | `uid` | `title`               | `time`              |
+--------+-------+-----------------------+---------------------+
| 2      | 15    | 'Worf is in the air'  | 2016-02-11 22:57:35 |
| 29     | 567   | 'Stargold'            | 2016-08-11 22:57:28 |
| 55     | 567   | 'No love liers'       | 2016-10-11 22:57:51 | 
| 698    | 567   | 'Lofe'                | 2016-11-11 22:57:44 |
+--------+-------+-----------------------+---------------------+

So I tried:

SELECT `views`.`track`, `tracks`.*, `users`.*, COUNT(`track`) as `count` 
FROM `views`,`tracks`,`users` 
WHERE `views`.`track` = `tracks`.`id` 
    AND `tracks`.`uid` = `users`.`idu`  
GROUP BY `tracks`.`uid` 
ORDER BY `count` 
DESC LIMIT 0, 20

RESULTS

+--------+-------+---------+---------------------+----------+-----------------------+---------------------+
| `id`   | `uid` | `count` | `username`          | `track`  | `title`               | `time`              |
+--------+-------+---------+---------------------+----------+-----------------------+---------------------+
| 29     | 567   |  6      | 'TrainingPuppy'     | 29       | 'Stargold'            | 2016-10-11 22:57:51 | 
| 2      | 15    |  1      | 'SewingButtons'     | 2        | 'Worf is in the air'  | 2016-02-11 22:57:35 |
+--------+-------+---------+---------------------+----------+-----------------------+---------------------+

Instead I need to select only 1 track (the most counted in views table) of each user.
Something like below:

+--------+-------+---------+---------------------+----------+---------------------+---------------------+
| `id`   | `uid` | `count` | `username`          | `track`  | `title`             | `time`              |
+--------+-------+---------+---------------------+----------+---------------------+---------------------+
| 55     | 567   |  3      | 'TrainingPuppy'     | 55       | 'No love liers'     | 2016-10-11 22:57:51 | 
| 2      | 15    |  1      | 'SewingButtons'     | 2        | 'Worf is in the air'| 2016-02-11 22:57:35 |
+--------+-------+---------+---------------------+----------+---------------------+---------------------+

How can I achieve this?

Best Answer

Setup the tables and some sample data:

drop table if exists users;
drop table if exists views;
drop table if exists tracks;

create table users
(idu      int
,username varchar(30));

create table views
(id       int
,`by`     int
,track    int);

create table tracks
(id       int
,uid      int
,title    varchar(30)
,time     datetime);

insert into users values
(567,'TrainingPuppy'),
(687,'BathroomMakeover'),
( 45,'PoshNachos'),
( 15,'SewingButtons');

insert into views values
(1,  5, 55),
(2,  5, 55),
(3, 67, 55),
(4,  6, 29),
(5,125,  2),
(6,  5,698),
(7,  5,698);

insert into tracks values
(  2, 15,'Worf is in the air','2016-02-11 22:57:35'),
( 29,567,'Stargold'          ,'2016-08-11 22:57:28'),
( 55,567,'No love liers'     ,'2016-10-11 22:57:51'),
(698,567,'Lofe'              ,'2016-11-11 22:57:44');

Start with a (ANSI standard group by) query to obtain our counts:

select tracks.id,
       tracks.uid,
       count(*) as ucount,
       users.username,
       views.track,
       tracks.title,
       tracks.time

from   tracks
join   views
on     views.track = tracks.id

join   users
on     users.idu = tracks.uid

group by tracks.id,
         tracks.uid,
         users.username,
         views.track,
         tracks.title,
         tracks.time

We'll call the above 'user_counts', and create a query that generates a ranking over username ordered by ucount desc:

select id,
       uid,
       ucount,
       username,
       track,
       title,
       time,
       @user_rank := if(@current_username = username, @user_rank:=@user_rank+1, 1) as user_rank,
       @current_username := username

from   user_counts,

        -- initialize our variables
join   (select @user_rank := 0, @current_username := 'abcdefg') r

order by username, ucount desc

Then wrap this in our final select, making sure to only display those rows with a ranking of '1' ... pulling it all together into a single query we have:

select id,
       uid,
       ucount as 'count',
       username,
       track,
       title,
       time

from   (select id,
               uid,
               ucount,
               username,
               track,
               title,
               time,
               @user_rank        := if(@current_username = username, @user_rank:=@user_rank+1, 1) as user_rank,
               @current_username := username

        from   (select tracks.id,
                       tracks.uid,
                       count(*) as ucount,
                       users.username,
                       views.track,
                       tracks.title,
                       tracks.time

                from    tracks
                join    views
                on      views.track = tracks.id

                join    users
                on      users.idu = tracks.uid

                group by tracks.id,
                         tracks.uid,
                         users.username,
                         views.track,
                         tracks.title,
                         tracks.time) user_counts

                -- initialize our variables
        join   (select @user_rank := 0, @current_username := 'abcdefg') r

        order by username, ucount desc) dt

where   user_rank = 1

order by ucount desc, username

And the results:

id uid count username      track title              time
-- --- ----- ------------- ----- ------------------ -------------------
55 567     3 TrainingPuppy    55 No love liers      2016-10-11 22:57:51
 2  15     1 SewingButtons     2 Worf is in the air 2016-02-11 22:57:35

Some fiddles:

NOTE: If a given username has the same max(count) for more than one track/title, this solution will only display one record; to display multiple records would require a tweak to ensure said records receive the same ranking (ie, user_rank = 1).

SUGGESTION #1

You should store the timestamp of the minute and index it

ALTER TABLE mytable ADD COLUMN MinuteTimeStamp AFTER TimeStamp;
UPDATE mytable SET MinuteTimeStamp = TimeStamp - MOD(TimeStamp,60);
ALTER TABLE mytable ADD INDEX MinuteTimeStamp_UniqueKey_ndx (MinuteTimeStamp,UniqueKey);

Then, you can do MIN aggregation on MinuteTimeStamp.

SELECT MinuteTimeStamp,MIN(UniqueKey) UniqueKey
FROM mytable GROUP BY MinuteTimeStamp;

and use it get those records

SELECT B.* FROM
(SELECT MinuteTimeStamp,MIN(UniqueKey) UniqueKey
FROM mytable GROUP BY MinuteTimeStamp) A
INNER JOIN mytable B USING (UniqueKey);

It was tactfully pointed out that triggers would degrade performance

Perhaps doing INSERTs like this may help

INSERT INTO mytable (UniqueKey,TimeStamp,MinuteTimeStamp) VALUES
(
    uniquevalue,
    UNIX_TIMESTAMP(NOW()),
    UNIX_TIMESTAMP(NOW() - INTERVAL SECOND(NOW()) SECOND)
);

SUGGESTION #2

Since you have over 1000 columns (Ugh), perhaps a table of those minute timestamps would be better.

CREATE TABLE MinuteKeys
(
    MinuteTimeStamp INT UNSIGNED NOT NULL,
    UniqueKey INT UNSIGNED NOT NULL,
    PRIMARY KEY (UniqueKey)
    KEY MinuteTimeStamp_UniqueKey_ndx (MinuteTimeStamp,UniqueKey)
) ENGINE=MyISAM;
ALTER TABLE MinuteKeys DISABLE KEYS;
INSERT INTO MinuteKeys SELECT TimeStamp - MOD(TimeStamp,60),UniqueKey FROM mytable;
ALTER TABLE MinuteKeys ENABLE KEYS;

Then, you could use that table for the aggregation

SELECT B.* FROM
(SELECT MinuteTimeStamp,MIN(UniqueKey) UniqueKey
FROM MinuteKeys GROUP BY MinuteTimeStamp) A
INNER JOIN mytable B USING (UniqueKey);

EPILOGUE

Other suggestions are possible but you should really consider normalization of the table

See my post Too many columns in MySQL as to why

Best Answer

Related Solutions

MySQL – Joining Three Tables Using IDs as Foreign Keys

Mysql – query records to fetch by time stamp

SUGGESTION #1

SUGGESTION #2

EPILOGUE

Related Question