Thesql group by on table from two derived non-indexable

derived-tablesgroup byMySQL

Working with mysql 5.7. I've got a query that ends up joining two derived tables and grouping on one of the columns. The query without the grouping runs pretty fast…. .5 seconds and returns 15K rows (In production I would expect that to possibly hit 100K + rows). When I do a group on this query it really kills the performance resulting in a query that takes 10x longer. I have tried to reorganize the query to push the group by logic to an earlier step; logically that doesn't work. I've also tried creating a temp table to insert the query's results that takes .5 seconds to run but the inserting into the temp table takes ~5 seconds (and I'm assuming that exactly what is happening with the original query given what the explain says). I've also tried modifying the session setting for the join and sort buffers but that doesn't seem to change anything at all. Does anybody have any advice on what else I could try or what I have tried that I haven't tried "correctly". Do I need to do something else besides

SET SESSION sort_buffer_size = 1024*1024*4;
SET SESSION join_buffer_size = 1024*1024*4;

to get those conf settings to work cause they don't seem to do anything at all to my query performance. I've adjust those values all over the place from 1024*4 to 1024*1024*10

Also, I'm a bit miffed as to why it takes ~5 seconds to add 15K rows to a temp table. I've tried memory engine, adding indexes etc… always takes ~5 seconds.

Here is a very simple boiled down version of the query that may help get the gist of what I'm doing

select group_concat(storeID), fileID 
    from ( select ugfileToStores.storeID, ugfileToStores.fileID
         from ( select usergroupID, storeID from tableOne 
              join ( select fileID, storeID from tableTwo ) tableTwo 
              on tableOne.storeID = tableTwo.storeID ) ugfileToStores
    ) fileToStores 
group by fileID 

Without the group by at the end and the group_concat the query runs in ~.5 seconds… with it it runs in ~5 seconds.

Here is an example of the actual query:

SELECT     sql_no_cache group_concat(usergroup0.storeid),
           filebucket0.filebucketid
FROM       (
                      SELECT     en.id      AS usergroupid,
                                 st.storeid AS storeid,
                                 1          AS assignmentcount
                      FROM       entity en
                      CROSS JOIN
                                 (
                                        SELECT storeid
                                        FROM   masterentity_attachedstores
                                        WHERE  masterentityid = 156825) st
                      WHERE      en.id IN ('156830') ) usergroup0
INNER JOIN
           (
                  SELECT maxout.filebucketid,
                         maxout.storeid
                  FROM   (
                                SELECT filebucketid,
                                       storeid,
                                       entityid
                                FROM   (
                                              SELECT stb.id                      AS filebucketid,
                                                     AS.storeid                  AS storeid,
                                                     1                           AS assignmentcount
                                              FROM   masterentity_attachedstores AS
                                              JOIN   entity en
                                              ON     AS.storeid = en.id
                                              JOIN   filebucket AS stb
                                              WHERE  en.isdeleted = 0
                                              AND    AS.masterentityid = 156825
                                              AND    stb.id IN (55,40)
                                              UNION ALL
                                              SELECT     stb.id              AS filebucketid,
                                                         stb_ret_s.id        AS storeid,
                                                         count(stb_ret_a.id) AS assignmentcount
                                              FROM       assignment          AS stb_ret_a
                                              JOIN       filebucket          AS stb
                                              ON         stb.id = stb_ret_a.associatedobjectid
                                              AND        stb.id IN (69,50,68)
                                              INNER JOIN entity AS stb_ret_e
                                              ON         stb_ret_e.id = stb_ret_a.assignmentvalue
                                              AND        stb_ret_e.classname = 'Retailer'
                                              AND        stb_ret_a.assignmentsubtype IN ('RetailerID')
                                              JOIN       store stb_ret_s
                                              ON         stb_ret_s.retailerid = stb_ret_e.id
                                              WHERE      stb_ret_a.associatedobjectclass = 'FileBucket'
                                              AND        stb_ret_a.isdeleted = 0
                                              AND        stb_ret_a.assignmentsubtype IN ('RetailerID')
                                              AND        stb_ret_e.isdeleted = 0
                                              GROUP BY   filebucketid,
                                                         storeid
                                              UNION ALL
                                              SELECT     filebucket.id AS filebucketid,
                                                         stb.storeid   AS storeid,
                                                         1             AS assignmentcount
                                              FROM       filebucket
                                              CROSS JOIN
                                                         (
                                                                SELECT maxout.repid,
                                                                       maxout.storeid
                                                                FROM   (
                                                                              SELECT repid,
                                                                                     storeid,
                                                                                     entityid
                                                                              FROM   (
                                                                                              SELECT   mp.id              AS repid,
                                                                                                       mp_cf_csv.entityid AS storeid,
                                                                                                       count(mp_cf_a.id)  AS assignmentcount
                                                                                              FROM     assignment         AS mp_cf_a
                                                                                              JOIN     rep                AS mp
                                                                                              ON       mp.id = mp_cf_a.associatedobjectid
                                                                                              JOIN     `customfieldvalue` AS mp_cf_csv
                                                                                              ON       mp_cf_csv.`value` = REPLACE(REPLACE(substring_index(substring_index(mp_cf_a.assignmentvalue, ',"', -1), ':', -1), '"',''), '}','')
                                                                                              AND      mp_cf_csv.`customfieldid` = REPLACE(substring_index(substring_index(mp_cf_a.assignmentvalue, ',', 1), ':', -1), '"','')
                                                                                              JOIN     entity AS mp_cf_e
                                                                                              ON       mp_cf_e.id = mp_cf_csv.entityid
                                                                                              WHERE    mp_cf_a.associatedobjectid IN (7400,7825,7780,7700)
                                                                                              AND      mp_cf_a.associatedobjectclass = 'Rep'
                                                                                              AND      mp_cf_a.isdeleted = 0
                                                                                              AND      mp_cf_a.assignmentsubtype IN ('CustomFieldValue')
                                                                                              AND      mp_cf_e.isdeleted = 0
                                                                                              GROUP BY repid,
                                                                                                       storeid
                                                                                              UNION ALL
                                                                                              SELECT   mp.id              AS repid,
                                                                                                       mp_ret_s.id        AS storeid,
                                                                                                       count(mp_ret_a.id) AS assignmentcount
                                                                                              FROM     assignment         AS mp_ret_a
                                                                                              JOIN     rep                AS mp
                                                                                              ON       mp.id = mp_ret_a.associatedobjectid
                                                                                              JOIN     store AS mp_ret_s
                                                                                              ON       mp_ret_s.retailerid = mp_ret_a.assignmentvalue
                                                                                              AND      mp_ret_a.assignmentsubtype IN ('RetailerID')
                                                                                              JOIN     entity AS mp_ret_e
                                                                                              ON       mp_ret_e.id = mp_ret_s.id
                                                                                              WHERE    mp_ret_a.associatedobjectid IN (7700,7400,7780,7825)
                                                                                              AND      mp_ret_a.associatedobjectclass = 'Rep'
                                                                                              AND      mp_ret_a.isdeleted = 0
                                                                                              AND      mp_ret_a.assignmentsubtype IN ('RetailerID')
                                                                                              AND      mp_ret_e.isdeleted = 0
                                                                                              GROUP BY repid,
                                                                                                       storeid) orouttie
                                                                              JOIN   masterentity_attachedstores AS
                                                                              ON     orouttie.storeid = AS.storeid
                                                                              AND    AS.masterentityid = 156825
                                                                              JOIN
                                                                                     (
                                                                                              SELECT   associatedobjectid,
                                                                                                       ownerobjectid
                                                                                              FROM     assignment
                                                                                              WHERE    associatedobjectid IN (7400,7700,7780,7825)
                                                                                              AND      associatedobjectclass='Rep'
                                                                                              GROUP BY associatedobjectid) creatorassignment
                                                                              ON     creatorassignment.associatedobjectid = orouttie.repid
                                                                              JOIN   entityuser creatorentity
                                                                              ON     creatorentity.id = creatorassignment.ownerobjectid ) maxout
                                                                JOIN   asshelper0660552001475295480164_out outcreator
                                                                ON     maxout.storeid = outcreator.storeid
                                                                AND    outcreator.usergroupid = maxout.entityid ) stb
                                              WHERE      id IN (60,55,50)
                                              UNION ALL
                                              SELECT stb.id                      AS filebucketid,
                                                     AS.storeid                  AS storeid,
                                                     1                           AS assignmentcount
                                              FROM   masterentity_attachedstores AS
                                              JOIN   entity en
                                              ON     AS.storeid = en.id
                                              JOIN   filebucket AS stb
                                              WHERE  en.isdeleted = 0
                                              AND    AS.masterentityid = 156825
                                              AND    stb.id IN (40)) orouttie
                                JOIN   masterentity_attachedstores AS
                                ON     orouttie.storeid = AS.storeid
                                AND    AS.masterentityid = 156825
                                JOIN
                                       (
                                                SELECT   associatedobjectid,
                                                         ownerobjectid
                                                FROM     assignment
                                                WHERE    associatedobjectid IN (40,50,55,60,65,67,68,69)
                                                AND      associatedobjectclass='FileBucket'
                                                GROUP BY associatedobjectid) creatorassignment
                                ON     creatorassignment.associatedobjectid = orouttie.filebucketid
                                JOIN   entityuser creatorentity
                                ON     creatorentity.id = creatorassignment.ownerobjectid ) maxout
                  JOIN   asshelper0777657001475295479813_out outcreator
                  ON     maxout.storeid = outcreator.storeid
                  AND    outcreator.usergroupid = maxout.entityid ) filebucket0
ON         filebucket0.storeid = usergroup0.storeid
GROUP BY   filebucket0.filebucketid  

Best Answer

Ok, so I've got a solution for myself. Given this simplified example of the query I was trying to optimize....

select group_concat(storeID), fileID 
    from ( select ugfileToStores.storeID, ugfileToStores.fileID
         from ( select usergroupID, storeID from tableOne 
              join ( select fileID, storeID from tableTwo ) tableTwo 
              on tableOne.storeID = tableTwo.storeID ) ugfileToStores
    ) fileToStores 
group by fileID 

I ended up creating two temp tables filled with "tableOne" and "tableTwo"'s info. In reality tableOne and tableTwo are pretty massive subqueries that run fast independently but when joining and grouping the two they suck, performance-wise. So, anyway, I start a db transaction, create the two temp tables, then run the new "main" query which now looks something like this.

select group_concat(storeID), fileID 
    from tempTableOne join tempTableTwo 
      on tempTableOne.storeID = tempTableTwo.storeID 
group by fileID  

I created the two temp tables with indexes and using the memory engine... an example of that given the simplified query is (the example is really too simple because I'm just selecting from one table from it, but in reality it's a huge query)

CREATE TEMPORARY TABLE IF NOT EXISTS 
                tempTableOne (FileID INT, StoreID INT, INDEX stb (`FileID`), INDEX st (`StoreID`)) engine=memory
                 AS ( select FileID, storeID from tableOne )

This is working like a dream. It's running the old query that took ~5 seconds in 90ms and the temp table creation takes a combines .2 seconds. My assumption on what is going on is that those temp tables get created very quickly because I am working off indexes to build them up and then when I create indexes on those temp tables I am able to group by much faster in the final main query!