Mysql – Getting the query right

group-concatenationMySQL

CREATE TABLE `Action` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `created_By_Id` int(10) unsigned NOT NULL,
  `name` varchar(60) NOT NULL,
  `created_At` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `notes` varchar(400) DEFAULT NULL,
  PRIMARY KEY (`id`),

  KEY `action_created_By_Id_fk` (`created_By_Id`),

  CONSTRAINT `action_created_By_Id_fk` FOREIGN KEY (`created_By_Id`) REFERENCES `Employee` (`id`) ON DELETE CASCADE,

) ENGINE=InnoDB AUTO_INCREMENT=502004 DEFAULT CHARSET=latin1


CREATE TABLE `ActionAssignedTo` (
  `action_Id` int(10) unsigned DEFAULT NULL,
  `assignee_Id` int(10) unsigned DEFAULT NULL,
 -- KEY `actionassignedto_assignee_id_foreign` (`assignee_Id`),
 -- replaced by:
 **UNIQUE KEY `actionassignedto_action_id_assignee_id_unique` (`action_Id`,`assignee_Id`),**

  KEY `actionassignedto_action_id_foreign` (`action_Id`),
  CONSTRAINT `ActionAssignedTo_ibfk_1` FOREIGN KEY (`assignee_Id`) REFERENCES `Employee` (`id`) ON DELETE CASCADE,
  CONSTRAINT `ActionAssignedTo_ibfk_2` FOREIGN KEY (`action_Id`) REFERENCES `Action` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1

CREATE TABLE `Employee` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `vendor_Id` int(10) unsigned DEFAULT NULL,
  `name` varchar(40) NOT NULL,     
  PRIMARY KEY (`id`),
  KEY `employee_vendor_id_foreign` (`vendor_Id`),
  CONSTRAINT `employee_vendor_id_foreign` FOREIGN KEY (`vendor_Id`) REFERENCES `Vendor` (`vendor_Id`)
) ENGINE=InnoDB AUTO_INCREMENT=511 DEFAULT CHARSET=latin1

CREATE TABLE `ActionComment` (
  `comment_Id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `commentor_Id` int(10) unsigned NOT NULL,
  `action_Id` int(10) unsigned NOT NULL,
  `comment` varchar(255) DEFAULT NULL,     
  `created_ActionComment_At` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`comment_Id`),
  KEY `actioncomment_commentor_id_foreign` (`commentor_Id`),
  KEY `actioncomment_action_id_foreign` (`action_Id`),
  CONSTRAINT `actioncomment_action_id_foreign` FOREIGN KEY (`action_Id`) REFERENCES `Action` (`id`) ON DELETE CASCADE,
  CONSTRAINT `actioncomment_commentor_id_foreign` FOREIGN KEY (`commentor_Id`) REFERENCES `Employee` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=latin1

Now I need to fetch the Action with id = 1. As Business rule I also need to verify if that Action is assigned to me(empId = 2) only then I can fetch the Action.

Below is the query :

select      Action.name,  
            group_concat(AssigneeNameTable.name) as assignedTo, 
            group_concat( AssigneeNameTable.id) as assignedToId, 
            concat( '[', 
              group_concat( 
                 json_object(
                   'comment', ActionComment.comment, 
                   'comDate', ActionComment.created_ActionComment_At)
                    order by ActionComment.created_ActionComment_At asc)
            , ']')as comments,

from        Action
inner join  Employee 
on          Action.created_By_Id = Employee.id 
and         Employee.vendor_Id in (select vendor_Id 
                                   from Employee 
                                   where vendor_Id = 2 
                                   and id = 2)

inner join ActionAssignedTo
on         Action.id = ActionAssignedTo.action_Id 
and        ActionAssignedTo.action_Id = 1 
and        ActionAssignedTo.assignee_Id = 2 // Business rule check

//we take inner join again on ActionAssignedTo so we get other assignees for the action
inner join ActionAssignedTo as ATable 
on         ActionAssignedTo.action_Id = ATable.action_Id

inner join Employee as AssigneeNameTable 
on         ATable.assignee_Id = AssigneeNameTable.Id

left join ActionComment 
on        Action.id = ActionComment.action_Id 

where    Action.deleted_At is null
group by Action.id 

Now, with this query : since there are multiple assignees and multiple comments for an action the result returned (partial output)

|assignedTo                | assignedToId|       comments|

| A,B,A,B,A,B              | 2,4,2,4,2,4 |        {// group of comments repeated 3 times}

To avoid the problem of repetition we use DISTINCT before each group concat
and we did get the results as expected

|assignedTo                | assignedToId|       comments|

| A,B                      | 2,4         |        {// group of comments}

We were curious to know if we can write the query in a more efficient way
and if its a good idea to use Distinct with group_concat


Pretty Code from Rick James

          EXPLAIN FORMAT=JSON select  A.name,
           group_concat(E1.name) as assignedTo,
           group_concat( E1.id) as assignedToId,
           (select concat( '[', group_concat(
                  json_object( 'comment', AC.comment, 'comDate',
                               AC.created_ActionComment_At)
                       order by  AC.created_ActionComment_At asc) , ']'
                 ) from ActionComment as AC where AC.action_Id = 1 ) as comments
       from Action AS A
       inner join Employee AS E  ON A.created_By_Id = E.id
         and  E.vendor_Id in (
           SELECT  vendor_Id
               from Employee
               where  vendor_Id = 2
                 and  id = 2 )
       inner join ActionAssignedTo AS To1  ON A.id = To1.action_Id
         and  To1.action_Id = 2
         and  To1.assignee_Id = 2
       inner join ActionAssignedTo AS To2   ON To1.action_Id = To2.action_Id
       inner join Employee AS E1  ON To2.assignee_Id = E1.Id
       where  A.deleted_At is null
       group by  A.id

Output :

    {
      "query_block": {
        "select_id": 1,
        "cost_info": {
          "query_cost": "3.80"
        },
        "grouping_operation": {
          "using_filesort": false,
          "nested_loop": [
            {
              "table": {
                "table_name": "A",
                "access_type": "const",
                "possible_keys": [
                  "PRIMARY",
                  "action_created_by_id_foreign",
                  "deletedAtIndex"
                ],
                "key": "PRIMARY",
                "used_key_parts": [
                  "id"
                ],
                "key_length": "4",
                "ref": [
                  "const"
                ],
                "rows_examined_per_scan": 1,
                "rows_produced_per_join": 1,
                "filtered": "100.00",
                "cost_info": {
                  "read_cost": "0.00",
                  "eval_cost": "0.20",
                  "prefix_cost": "0.00",
                  "data_read_per_join": "512"
                },
                "used_columns": [
                  "id",
                  "created_By_Id",
                  "name",
                  "deleted_At"
                ]
              }
            },
            {
              "table": {
                "table_name": "E",
                "access_type": "const",
                "possible_keys": [
                  "PRIMARY",
                  "employee_vendor_id_foreign"
                ],
                "key": "PRIMARY",
                "used_key_parts": [
                  "id"
                ],
                "key_length": "4",
                "ref": [
                  "const"
                ],
                "rows_examined_per_scan": 1,
                "rows_produced_per_join": 1,
                "filtered": "100.00",
                "cost_info": {
                  "read_cost": "0.00",
                  "eval_cost": "0.20",
                  "prefix_cost": "0.00",
                  "data_read_per_join": "560"
                },
                "used_columns": [
                  "id",
                  "vendor_Id"
                ]
              }
            },
            {
              "table": {
                "table_name": "To1",
                "access_type": "const",
                "possible_keys": [
                  "actionassignedto_action_id_assignee_id_unique",
                  "actionassignedto_assignee_id_foreign"
                ],
                "key": "actionassignedto_action_id_assignee_id_unique",
                "used_key_parts": [
                  "action_Id",
                  "assignee_Id"
                ],
                "key_length": "10",
                "ref": [
                  "const",
                  "const"
                ],
                "rows_examined_per_scan": 1,
                "rows_produced_per_join": 1,
                "filtered": "100.00",
                "using_index": true,
                "cost_info": {
                  "read_cost": "0.00",
                  "eval_cost": "0.20",
                  "prefix_cost": "0.00",
                  "data_read_per_join": "16"
                },
                "used_columns": [
                  "action_Id",
                  "assignee_Id"
                ]
              }
            },
            {
              "table": {
                "table_name": "Employee",
                "access_type": "const",
                "possible_keys": [
                  "PRIMARY",
                  "employee_vendor_id_foreign"
                ],
                "key": "PRIMARY",
                "used_key_parts": [
                  "id"
                ],
                "key_length": "4",
                "ref": [
                  "const"
                ],
                "rows_examined_per_scan": 1,
                "rows_produced_per_join": 1,
                "filtered": "100.00",
                "cost_info": {
                  "read_cost": "0.00",
                  "eval_cost": "0.20",
                  "prefix_cost": "0.00",
                  "data_read_per_join": "560"
                },
                "used_columns": [
                  "id",
                  "vendor_Id"
                ]
              }
            },
            {
              "table": {
                "table_name": "To2",
                "access_type": "ref",
                "possible_keys": [
                  "actionassignedto_action_id_assignee_id_unique",
                  "actionassignedto_assignee_id_foreign"
                ],
                "key": "actionassignedto_action_id_assignee_id_unique",
                "used_key_parts": [
                  "action_Id"
                ],
                "key_length": "5",
                "ref": [
                  "const"
                ],
                "rows_examined_per_scan": 2,
                "rows_produced_per_join": 2,
                "filtered": "100.00",
                "using_index": true,
                "cost_info": {
                  "read_cost": "1.00",
                  "eval_cost": "0.40",
                  "prefix_cost": "1.40",
                  "data_read_per_join": "32"
                },
                "used_columns": [
                  "action_Id",
                  "assignee_Id"
                ],
                "attached_condition": "(`S1`.`To2`.`assignee_Id` is not null)"
              }
            },
            {
              "table": {
                "table_name": "E1",
                "access_type": "eq_ref",
                "possible_keys": [
                  "PRIMARY"
                ],
                "key": "PRIMARY",
                "used_key_parts": [
                  "id"
                ],
                "key_length": "4",
                "ref": [
                  "S1.To2.assignee_Id"
                ],
                "rows_examined_per_scan": 1,
                "rows_produced_per_join": 2,
                "filtered": "100.00",
                "cost_info": {
                  "read_cost": "2.00",
                  "eval_cost": "0.40",
                  "prefix_cost": "3.80",
                  "data_read_per_join": "1K"
                },
                "used_columns": [
                  "id",
                  "name"
                ]
              }
            }
          ],
          "select_list_subqueries": [
            {
              "dependent": false,
              "cacheable": true,
              "query_block": {
                "select_id": 2,
                "cost_info": {
                  "query_cost": "2.40"
                },
                "table": {
                  "table_name": "AC",
                  "access_type": "ALL",
                  "possible_keys": [
                    "actioncomment_action_id_foreign"
                  ],
                  "rows_examined_per_scan": 7,
                  "rows_produced_per_join": 7,
                  "filtered": "100.00",
                  "cost_info": {
                    "read_cost": "1.00",
                    "eval_cost": "1.40",
                    "prefix_cost": "2.40",
                    "data_read_per_join": "3K"
                  },
                  "used_columns": [
                    "action_Id",
                    "comment",
                    "created_ActionComment_At"
                  ],
                  "attached_condition": "(`S1`.`AC`.`action_Id` = 1)"
                }
              }
            }
          ]
        }
      }
    }

Best Answer

You seem to have 2 questions.

  • Is there something better than DISTINCT in GROUP_CONCAT()? No. Use it when you need it.

  • Can the query be sped up or otherwise improved?

Don't use IN ( SELECT ... ), use EXISTS or JOIN.

JOIN + GROUP BY smells like the inefficient "explode-implode" pattern. First you build a large set or rows (from the joins), then you collapse them back to one row per row in the first table (via group by). It is often grossly inefficient, leads to inflated COUNT() and SUM(), and in your case the need for DISTINCT.

Indexes:

ActionAssignedTo:  (action_Id, assignee_Id) -- in this order; smells like the PK?

ActionAssignedTo smells like a many-to-many table. It really needs PRIMARY KEY(action_Id, assignee_Id). This, alone, may help performance. More discussion

What do you want to happen if there is nothing in ActionComment for a given action_Id? Seems like the JSON will be mangled. If not, get rid of the LEFT JOIN and change

concat( '[', ... ) as comments

into

( SELECT concat( '[', ... ) FROM ActionComment ) as comments

If there is a possibility of no rows, tack on an IFNULL or something to avoid a mess.

Please provide EXPLAIN SELECT ...

Before my suggested changes, here is a prettyprint of the code:

select  A.name,
        group_concat(E.name) as assignedTo,
        group_concat( E.id) as assignedToId,
        concat( '[', group_concat(
               json_object( 'comment', AC.comment, 'comDate',
                            AC.created_ActionComment_At)
                    order by  AC.created_ActionComment_At asc) , ']'
              ) as comments,
    from  Action AS A
    inner join  Employee AS E  ON A.created_By_Id = E.id
      and  E.vendor_Id in (
        SELECT  vendor_Id
            from  Employee
            where  vendor_Id = 2
              and  id = 2 )
    inner join  ActionAssignedTo AS To1  ON A.id = To1.action_Id
      and  To1.action_Id = 1
      and  To1.assignee_Id = 2 // Business rule check
    inner join  ActionAssignedTo AS To2 as  ON To1.action_Id = To2.action_Id
    inner join  Employee AS E as AssigneeNameTable AS E  ON To2.assignee_Id = E.Id
    left join  ActionComment AS AC  ON A.id = AC.action_Id
    where  A.deleted_At is null
    group by  A.id 

There may be more to improve; this is rather complex.

More

Shouldn't both of the columns in ActionAssignedTo be NOT NULL??