Mysql – “GROUP BY” in ranges

group byMySQL

Suppose I have a table with a numeric column (lets call it "score").

I'd like to generate a table of counts, that shows how many times scores appeared in each range.

For example:

score range  | number of occurrences
-------------------------------------
   0-9       |        11
  10-19      |        14
  20-29      |         3
   ...       |       ...

In the example I have given explicit ranges. But don't know what range could be possible.

Is there an easy way to set this up? What do you recommend?

Best Answer

SUGGESTED QUERY

SELECT
    CONCAT(A.ndx,' - ',A.ndx+9) "score range",
    IFNULL(B.rowcount ,0) "number of occurrences"
FROM
(
    SELECT 0 ndx UNION SELECT 10 UNION SELECT 20 UNION
    SELECT 30 UNION SELECT 40 UNION SELECT 50 UNION SELECT 60
    UNION SELECT 70 UNION SELECT 80 UNION SELECT 90
) A
LEFT JOIN
(
    SELECT ndx,COUNT(1) rowcount FROM
    (SELECT FLOOR(score/10)*10 ndx FROM yoshi_scores) AA
    GROUP BY ndx
) B USING (ndx);

SAMPLE DATA

use test
DROP TABLE IF EXISTS yoshi_scores;
CREATE TABLE yoshi_scores
(id int not null auto_increment,
score int not null,primary key (id), key (score));
INSERT INTO yoshi_scores (score) VALUES
(97),(74),(22),(98),(65),(62),(47),(64),(82),( 8),(60),(12),
(27),(14),(13),(28),(60),(12),(27),(34),(32),(89),(15),( 2);

Data would look like this

mysql> SELECT * FROM yoshi_scores ORDER BY id;
+----+-------+
| id | score |
+----+-------+
|  1 |    97 |
|  2 |    74 |
|  3 |    22 |
|  4 |    98 |
|  5 |    65 |
|  6 |    62 |
|  7 |    47 |
|  8 |    64 |
|  9 |    82 |
| 10 |     8 |
| 11 |    60 |
| 12 |    12 |
| 13 |    27 |
| 14 |    14 |
| 15 |    13 |
| 16 |    28 |
| 17 |    60 |
| 18 |    12 |
| 19 |    27 |
| 20 |    34 |
| 21 |    32 |
| 22 |    89 |
| 23 |    15 |
| 24 |     2 |
+----+-------+
24 rows in set (0.00 sec)

SUGGESTED QUERY EXECUTED

mysql> SELECT
    ->     CONCAT(A.ndx,' - ',A.ndx+9) "score range",
    ->     IFNULL(B.rowcount ,0) "number of occurrences"
    -> FROM
    -> (
    ->     SELECT 0 ndx UNION SELECT 10 UNION SELECT 20 UNION
    ->     SELECT 30 UNION SELECT 40 UNION SELECT 50 UNION SELECT 60
    ->     UNION SELECT 70 UNION SELECT 80 UNION SELECT 90
    -> ) A
    -> LEFT JOIN
    -> (
    ->     SELECT ndx,COUNT(1) rowcount FROM
    ->     (SELECT FLOOR(score/10)*10 ndx FROM yoshi_scores) AA
    ->     GROUP BY ndx
    -> ) B USING (ndx);
+-------------+-----------------------+
| score range | number of occurrences |
+-------------+-----------------------+
| 0 - 9       |                     2 |
| 10 - 19     |                     5 |
| 20 - 29     |                     4 |
| 30 - 39     |                     2 |
| 40 - 49     |                     1 |
| 50 - 59     |                     0 |
| 60 - 69     |                     5 |
| 70 - 79     |                     1 |
| 80 - 89     |                     2 |
| 90 - 99     |                     2 |
+-------------+-----------------------+
10 rows in set (0.00 sec)

mysql>

Why is the Query designed this way ??? Look at the first subquery

(
    SELECT 0 ndx UNION SELECT 10 UNION SELECT 20 UNION
    SELECT 30 UNION SELECT 40 UNION SELECT 50 UNION SELECT 60
    UNION SELECT 70 UNION SELECT 80 UNION SELECT 90
) A

I performed a LEFT JOIN of this to the counts for a reason. Notice that the dataset has nothing in the 50 - 59 range. Its count would not show up in the second query:

mysql> SELECT ndx,COUNT(1) rowcount FROM
    -> (SELECT FLOOR(score/10)*10 ndx FROM yoshi_scores) AA
    -> GROUP BY ndx;
+------+----------+
| ndx  | rowcount |
+------+----------+
|    0 |        2 |
|   10 |        5 |
|   20 |        4 |
|   30 |        2 |
|   40 |        1 |
|   60 |        5 |
|   70 |        1 |
|   80 |        2 |
|   90 |        2 |
+------+----------+
9 rows in set (0.00 sec)

mysql>

I am sure you wanted the range 50 - 59 to show up, so the query is designd to catch all ranges. Any missing range is essentially defaulted to zero.

GIVE IT A TRY !!!

CAVEAT: Unfortunately, the first subquery requires you to hardcode it. It will be the only hardcoding required to make it work.