It takes around 6 seconds for the following query to execute on a MacBook Air (1.8 GHz dual-core Intel Core CPU, 8 GB 1600 MHz DDR3) running mysql Ver 14.14 Distrib 5.7.27.
I would like to know whether that is to be expected given all of the parameters, or whether it could be optimised somehow.
SELECT DISTINCT `sessions`.`strain_id_id`,
`strains`.`product_cbd_units`,
`strains`.`product_thc_units`,
AVG(`session_symptoms`.`total_efficacy`) AS `efficacy`,
COUNT(`sessions`.`strain_id_id`) AS `uses`,
COUNT(DISTINCT `sessions`.`user_id_id`) AS `users`
FROM `session_symptoms`
INNER JOIN `sessions`
ON (`session_symptoms`.`session_id_id` = `sessions`.`id`)
INNER JOIN `strains`
ON (`sessions`.`strain_id_id` = `strains`.`id`)
INNER JOIN `product_province`
ON (`strains`.`id` = `product_province`.`strain_id`)
WHERE (`session_symptoms`.`symptom_id_id` = 1
AND `strains`.`archived` = 0
AND `strains`.`is_visible` = 1
AND `strains`.`trust_indicator` = '0'
AND `product_province`.`country_of_sale` = 'CA'
)
GROUP BY `sessions`.`strain_id_id`, `strains`.`product_cbd_units`,
`strains`.`product_thc_units`
HAVING (AVG(`session_symptoms`.`total_efficacy`) > '0'
AND COUNT(DISTINCT `sessions`.`user_id_id`) >= 10
)
ORDER BY `efficacy` DESC
LIMIT 10
The EXPLAIN seems OK to me, since the product_province table is the only one that is (nearly) completely scanned (I've tried adding an index on the country_of_sale field but did not see any improvement):
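For reference, the single-column index attempt would have been something along these lines (the index name here is illustrative, not the actual one used):

```sql
-- Illustrative form of the attempted index; a single-column index on
-- country_of_sale leaves strain_id outside the index, so the join column
-- still has to be fetched from the table rows.
ALTER TABLE `product_province` ADD INDEX idx_country (`country_of_sale`);
```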
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: product_province
partitions: NULL
type: ALL
possible_keys: product_province_strain_id_28314e95_fk_strains_id
key: NULL
key_len: NULL
ref: NULL
rows: 2784
filtered: 10.00
Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: strains
partitions: NULL
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: strainprint.product_province.strain_id
rows: 1
filtered: 5.00
Extra: Using where
*************************** 3. row ***************************
id: 1
select_type: SIMPLE
table: sessions
partitions: NULL
type: ref
possible_keys: PRIMARY,sessions_strain_id_id_524ebbd3_fk_strains_id
key: sessions_strain_id_id_524ebbd3_fk_strains_id
key_len: 4
ref: strainprint.product_province.strain_id
rows: 63
filtered: 100.00
Extra: NULL
*************************** 4. row ***************************
id: 1
select_type: SIMPLE
table: session_symptoms
partitions: NULL
type: ref
possible_keys: session_symptoms_session_id_id_57e72712_fk_sessions_id,session_symptoms_symptom_id_id_af0cc421_fk_symptoms_id
key: session_symptoms_session_id_id_57e72712_fk_sessions_id
key_len: 4
ref: strainprint.sessions.id
rows: 1
filtered: 15.68
Extra: Using where
4 rows in set, 1 warning (0.00 sec)
The session_symptoms table has 1,407,725 rows.
The sessions table has 865,298 rows.
The strains table has 26,130 rows.
The product_province table has 2,831 rows.
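For anyone reproducing this, the exact counts come from COUNT(*) (information_schema.TABLES only stores estimates for InnoDB tables):

```sql
-- Exact row counts; these are full scans, which is fine at these sizes.
SELECT COUNT(*) FROM `session_symptoms`;  -- 1407725
SELECT COUNT(*) FROM `sessions`;          -- 865298
SELECT COUNT(*) FROM `strains`;           -- 26130
SELECT COUNT(*) FROM `product_province`;  -- 2831
```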
I have been looking at the Innodb_buffer_pool_read_requests and Innodb_buffer_pool_reads status variables; these are the values before and after the query runs:
(before running the query)
Innodb_buffer_pool_read_requests 548617058
Innodb_buffer_pool_reads 20782
(after the query was run)
Innodb_buffer_pool_read_requests 555160863
Innodb_buffer_pool_reads 20782
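The counters above can be sampled in one statement immediately before and after running the query:

```sql
-- Both counters match this pattern; they are server-wide
-- (across all connections), not per-query.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
```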
If my understanding is correct, this means that InnoDB was able to serve the query entirely from in-memory pages. The buffer pool is set to 3 GB and only about 0.33 GB of it is in use. Here is the SHOW ENGINE INNODB STATUS output:
*************************** 1. row ***************************
Type: InnoDB
Name:
Status:
=====================================
2020-02-17 14:44:00 0x7f8685fb9700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 57 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 81 srv_active, 0 srv_shutdown, 33534 srv_idle
srv_master_thread log flush and writes: 33615
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 1874
OS WAIT ARRAY INFO: signal count 8043
RW-shared spins 0, rounds 845, OS waits 120
RW-excl spins 0, rounds 2322, OS waits 1211
RW-sx spins 1, rounds 30, OS waits 0
Spin rounds per wait: 845.00 RW-shared, 2322.00 RW-excl, 30.00 RW-sx
------------
TRANSACTIONS
------------
Trx id counter 4257212
Purge done for trx's n:o < 4256947 undo n:o < 0 state: running but idle
History list length 23
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 421694729191056, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421694729190136, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421694729189216, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
I/O thread 5 state: waiting for completed aio requests (read thread)
I/O thread 6 state: waiting for completed aio requests (write thread)
I/O thread 7 state: waiting for completed aio requests (write thread)
I/O thread 8 state: waiting for completed aio requests (write thread)
I/O thread 9 state: waiting for completed aio requests (write thread)
Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] ,
ibuf aio reads:, log i/o's:, sync i/o's:
Pending flushes (fsync) log: 0; buffer pool: 0
21600 OS file reads, 759 OS file writes, 336 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 186, seg size 188, 7 merges
merged operations:
insert 8, delete mark 0, delete 0
discarded operations:
insert 0, delete mark 0, delete 0
Hash table size 796967, node heap has 5307 buffer(s)
Hash table size 796967, node heap has 1278 buffer(s)
Hash table size 796967, node heap has 4 buffer(s)
Hash table size 796967, node heap has 7 buffer(s)
Hash table size 796967, node heap has 3 buffer(s)
Hash table size 796967, node heap has 2 buffer(s)
Hash table size 796967, node heap has 6 buffer(s)
Hash table size 796967, node heap has 1270 buffer(s)
37839.69 hash searches/s, 300.36 non-hash searches/s
---
LOG
---
Log sequence number 15773227988
Log flushed up to 15773227988
Pages flushed up to 15773227988
Last checkpoint at 15773227979
0 pending log flushes, 0 pending chkp writes
131 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 3298295808
Dictionary memory allocated 2870340
Buffer pool size 196608
Free buffers 167346
Database pages 21385
Old database pages 8040
Modified db pages 0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 21292, created 93, written 500
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 21385, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
This line:
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
should also mean that the buffer pool fully served the query, right? If that is the case, is it normal for the query to take around 6 seconds given that it did not need to read any pages from disk?
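Working out the deltas from the two snapshots supports this: the run added roughly 6.5M logical read requests but zero physical reads, i.e. a 100% buffer pool hit rate for this run (with the caveat that the counters are server-wide, so other connections could contribute):

```sql
-- Delta of the two status counters across the query run:
SELECT 555160863 - 548617058 AS logical_read_requests,  -- 6543805
       20782 - 20782 AS physical_disk_reads;            -- 0
```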
UPDATE: I've updated the indexes as per Rick James' answer:
ALTER TABLE `sessions` ADD INDEX strain_user (`strain_id_id`, `user_id_id`);
ALTER TABLE `session_symptoms` ADD INDEX sym_ses_eff (`symptom_id_id`, `session_id_id`, `total_efficacy`);
ALTER TABLE `session_symptoms` ADD INDEX ses_sym_eff (`session_id_id`, `symptom_id_id`, `total_efficacy`);
ALTER TABLE `strains` ADD INDEX lab_vis_arch_id_thc_cbd (`trust_indicator`, `is_visible`, `archived`, `id`, `product_thc_units`, `product_cbd_units`);
ALTER TABLE `product_province` ADD INDEX country_strain (`country_of_sale`, `strain_id`);
ALTER TABLE `product_province` ADD INDEX strain_country (`strain_id`, `country_of_sale`);
and the query now takes less than 2.5 seconds. The explain now looks like:
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: product_province
partitions: NULL
type: ref
possible_keys: product_province_strain_id_28314e95_fk_strains_id,country_strain,strain_country
key: country_strain
key_len: 4
ref: const
rows: 2386
filtered: 100.00
Extra: Using index; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: strains
partitions: NULL
type: eq_ref
possible_keys: PRIMARY,lab_vis_arch_id_thc_cbd
key: PRIMARY
key_len: 4
ref: strainprint.product_province.strain_id
rows: 1
filtered: 8.74
Extra: Using where
*************************** 3. row ***************************
id: 1
select_type: SIMPLE
table: sessions
partitions: NULL
type: ref
possible_keys: PRIMARY,sessions_strain_id_id_524ebbd3_fk_strains_id,strain_user
key: strain_user
key_len: 4
ref: strainprint.product_province.strain_id
rows: 40
filtered: 100.00
Extra: Using index
*************************** 4. row ***************************
id: 1
select_type: SIMPLE
table: session_symptoms
partitions: NULL
type: ref
possible_keys: session_symptoms_session_id_id_57e72712_fk_sessions_id,session_symptoms_symptom_id_id_af0cc421_fk_symptoms_id,sym_ses_eff,ses_sym_eff
key: sym_ses_eff
key_len: 8
ref: const,strainprint.sessions.id
rows: 1
filtered: 100.00
Extra: Using index
4 rows in set, 1 warning (0.01 sec)
Best Answer
SELECT DISTINCT and GROUP BY do similar things; don't specify both. Keep the GROUP BY.

Suggested indexes. Most are "covering". Some assume one ordering of the tables in the JOIN; some are optimized for the other ordering; the Optimizer can pick whichever is better.

20782 / 555160863 is a very low ratio, thereby indicating that innodb_buffer_pool_size is big enough. But there were about 7M "read requests", possibly indicating the need to touch a lot of data. (Caveat: these numbers are across all connections.)

"10 possible additional filters" -- That could lead to 10 separate optimizations, and possibly 40 more indexes needed! Even if we speed up this version, it may not help the other versions.
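Following the first point, dropping DISTINCT changes only the select list; the GROUP BY already guarantees one output row per group:

```sql
-- Same query with the redundant DISTINCT removed.
SELECT `sessions`.`strain_id_id`,
       `strains`.`product_cbd_units`,
       `strains`.`product_thc_units`,
       AVG(`session_symptoms`.`total_efficacy`) AS `efficacy`,
       COUNT(`sessions`.`strain_id_id`) AS `uses`,
       COUNT(DISTINCT `sessions`.`user_id_id`) AS `users`
-- FROM / JOIN / WHERE / GROUP BY / HAVING / ORDER BY / LIMIT
-- clauses remain exactly as in the original query.
```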