MySQL – Group_Concat Repeated Values with Different IDs

group-concatenationMySQLselect

I have two tables, first the table Product:

id|category_id
--+-----------
1 | 12345
2 | 12345
3 | 12465

And then a table activity:

id|prod_id|activity_type   |description
--+-------+----------------+-----------
1 | 1     | Initialization | blah
2 | 1     | Finalization   | foo
3 | 2     | Initialization | blah again
4 | 2     | Duplication    | bar
5 | 2     | Finalization   | foobar
6 | 3     | Initialization | blob
7 | 3     | Migration      | A to B
8 | 3     | Migration      | B to C
9 | 3     | Finalization   | fuh

Now I want to retrieve for each type of activity the number of product having at least one of this kind of activity, and also the list of product category. The categories will be repeated in the list for each product of this category. For now I'm using the following query:

SELECT a.activity_type as Activity, COUNT(DISTINCT p.id) as Products,
CONVERT(GROUP_CONCAT(p.category SEPARATOR ',  ') USING utf8) AS Categories
FROM mydb.product p, mydb.activity a
WHERE p.id = a.prod_id
AND a.activity_type <> '' // To not count activities which haven't been correctly initialized
GROUP BY Categories
ORDER BY Products

Now what I await for result is:

Activity       | Products | Categories
---------------+----------+--------------------
Initialization | 3        | 12345, 12345, 12465
Finalization   | 3        | 12345, 12345, 12465
Duplication    | 1        | 12345
Migration      | 1        | 12465

But with this query I get the value '12465, 12465' for Migration. I could I get that a category appears on the list, only for each different product ids, but not for each activity of one type?

Best Answer

First group by both activity_type and prod_id and then another group by activity_type:

SELECT 
    a.activity_type AS Activity, 
    COUNT(DISTINCT p.id) AS Products,
    CONVERT(GROUP_CONCAT(p.category_id SEPARATOR ',  ') USING utf8) 
      AS Categories
FROM 
    product AS p
  JOIN 
    ( SELECT activity_type
           , prod_id
      FROM activity 
      WHERE activity_type <> '' 
      GROUP BY activity_type
             , prod_id
    ) AS a
    ON p.id = a.prod_id
GROUP BY 
    activity_type
ORDER BY 
    Products DESC;

Tested in SQL-Fiddle (thank you @Mr.Radical)

You could also safely replace COUNT(DISTINCT p.id) with COUNT(*) in the above, as for every activity type, there are only distinct product IDs (this is taken care in the internal group by).