Mysql – Is this query too slow regarding the job it is doing

innodbMySQLoptimization

It takes around 6 seconds for the following query to execute on a MacBook Air with 1,8 GHz Dual-Core Intel Core and 8 GB 1600 MHz DDR3, running mysql Ver 14.14 Distrib 5.7.27.
I would like to know if that is to be expected given all of the parameters or could it be optimised somehow.

SELECT  DISTINCT `sessions`.`strain_id_id`,
        `strains`.`product_cbd_units`,
        `strains`.`product_thc_units`,
        AVG(`session_symptoms`.`total_efficacy`) AS `efficacy`,
        COUNT(`sessions`.`strain_id_id`) AS `uses`,
        COUNT(DISTINCT `sessions`.`user_id_id`) AS `users`
    FROM  `session_symptoms`
    INNER JOIN  `sessions`
        ON (`session_symptoms`.`session_id_id` = `sessions`.`id`)
    INNER JOIN  `strains`
        ON (`sessions`.`strain_id_id` = `strains`.`id`)
    INNER JOIN  `product_province`
        ON (`strains`.`id` = `product_province`.`strain_id`)
    WHERE  (`session_symptoms`.`symptom_id_id` = 1
              AND  `strains`.`archived` = 0
              AND  `strains`.`is_visible` = 1
              AND  `strains`.`trust_indicator` = '0'
              AND  `product_province`.`country_of_sale` = 'CA'
           )
    GROUP BY  `sessions`.`strain_id_id`, `strains`.`product_cbd_units`,
        `strains`.`product_thc_units`
    HAVING  (AVG(`session_symptoms`.`total_efficacy`) > '0'
              AND  COUNT(DISTINCT `sessions`.`user_id_id`) >= 10
            )
    ORDER BY  `efficacy` DESC
    LIMIT  10

The explain seems OK to me since the product_province table is the only one (nearly) completely scanned (I've tried adding an index on the country_of_sale field but did not see any improvements):

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: product_province
   partitions: NULL
         type: ALL
possible_keys: product_province_strain_id_28314e95_fk_strains_id
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2784
     filtered: 10.00
        Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: strains
   partitions: NULL
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: strainprint.product_province.strain_id
         rows: 1
     filtered: 5.00
        Extra: Using where
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: sessions
   partitions: NULL
         type: ref
possible_keys: PRIMARY,sessions_strain_id_id_524ebbd3_fk_strains_id
          key: sessions_strain_id_id_524ebbd3_fk_strains_id
      key_len: 4
          ref: strainprint.product_province.strain_id
         rows: 63
     filtered: 100.00
        Extra: NULL
*************************** 4. row ***************************
           id: 1
  select_type: SIMPLE
        table: session_symptoms
   partitions: NULL
         type: ref
possible_keys: session_symptoms_session_id_id_57e72712_fk_sessions_id,session_symptoms_symptom_id_id_af0cc421_fk_symptoms_id
          key: session_symptoms_session_id_id_57e72712_fk_sessions_id
      key_len: 4
          ref: strainprint.sessions.id
         rows: 1
     filtered: 15.68
        Extra: Using where
4 rows in set, 1 warning (0.00 sec)

The session_symptoms table has 1407725 rows.
The sessions table has 865298 rows.
The strains table has 26130 rows.
The product_province table has 2831 rows.

I have been looking at the Innodb_buffer_pool_read_requests and Innodb_buffer_pool_reads variables and these are the values before and after the query runs:

(before running the query)

Innodb_buffer_pool_read_requests    548617058
Innodb_buffer_pool_reads    20782

(after the query was run)

Innodb_buffer_pool_read_requests    555160863
Innodb_buffer_pool_reads    20782

If my understanding is correct, this means that the innodb was able to serve the query results from the in-memory pages. The buffer pool is set to 3GB and is 0.33GB full. Here is the innodb status output:

*************************** 1. row ***************************
  Type: InnoDB
  Name: 
Status: 
=====================================
2020-02-17 14:44:00 0x7f8685fb9700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 57 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 81 srv_active, 0 srv_shutdown, 33534 srv_idle
srv_master_thread log flush and writes: 33615
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 1874
OS WAIT ARRAY INFO: signal count 8043
RW-shared spins 0, rounds 845, OS waits 120
RW-excl spins 0, rounds 2322, OS waits 1211
RW-sx spins 1, rounds 30, OS waits 0
Spin rounds per wait: 845.00 RW-shared, 2322.00 RW-excl, 30.00 RW-sx
------------
TRANSACTIONS
------------
Trx id counter 4257212
Purge done for trx's n:o < 4256947 undo n:o < 0 state: running but idle
History list length 23
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 421694729191056, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421694729190136, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421694729189216, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
I/O thread 5 state: waiting for completed aio requests (read thread)
I/O thread 6 state: waiting for completed aio requests (write thread)
I/O thread 7 state: waiting for completed aio requests (write thread)
I/O thread 8 state: waiting for completed aio requests (write thread)
I/O thread 9 state: waiting for completed aio requests (write thread)
Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] ,
 ibuf aio reads:, log i/o's:, sync i/o's:
Pending flushes (fsync) log: 0; buffer pool: 0
21600 OS file reads, 759 OS file writes, 336 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 186, seg size 188, 7 merges
merged operations:
 insert 8, delete mark 0, delete 0
discarded operations:
 insert 0, delete mark 0, delete 0
Hash table size 796967, node heap has 5307 buffer(s)
Hash table size 796967, node heap has 1278 buffer(s)
Hash table size 796967, node heap has 4 buffer(s)
Hash table size 796967, node heap has 7 buffer(s)
Hash table size 796967, node heap has 3 buffer(s)
Hash table size 796967, node heap has 2 buffer(s)
Hash table size 796967, node heap has 6 buffer(s)
Hash table size 796967, node heap has 1270 buffer(s)
37839.69 hash searches/s, 300.36 non-hash searches/s
---
LOG
---
Log sequence number 15773227988
Log flushed up to   15773227988
Pages flushed up to 15773227988
Last checkpoint at  15773227979
0 pending log flushes, 0 pending chkp writes
131 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 3298295808
Dictionary memory allocated 2870340
Buffer pool size   196608
Free buffers       167346
Database pages     21385
Old database pages 8040
Modified db pages  0
Pending reads      0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 21292, created 93, written 500
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 21385, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]

This line:

Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000

should also mean that the buffer pool was fully utilised for the query, right? If that is the case, is it normal for the query to take around 6 seconds given that it did not require reading the pages from the disk?

UPDATE: I've updated the indexes as per Rick James' answer:

ALTER TABLE `sessions` ADD INDEX strain_user (`strain_id_id`, user_id_id)
ALTER TABLE `session_symptoms` ADD INDEX sym_ses_eff (`symptom_id_id`, session_id_id, total_efficacy)
ALTER TABLE `session_symptoms` ADD INDEX ses_sym_eff (session_id_id, `symptom_id_id`, total_efficacy)
ALTER TABLE `strains` ADD INDEX lab_vis_arch_id_thc_cbd (trust_indicator, `is_visible`, archived,
                                                        id, product_thc_units, product_cbd_units)
ALTER TABLE `product_province` ADD INDEX country_strain (`country_of_sale`, strain_id)
ALTER TABLE `product_province` ADD INDEX strain_country (strain_id, country_of_sale)

and the query now takes less than 2.5 seconds. The explain now looks like:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: product_province
   partitions: NULL
         type: ref
possible_keys: product_province_strain_id_28314e95_fk_strains_id,country_strain,strain_country
          key: country_strain
      key_len: 4
          ref: const
         rows: 2386
     filtered: 100.00
        Extra: Using index; Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: strains
   partitions: NULL
         type: eq_ref
possible_keys: PRIMARY,lab_vis_arch_id_thc_cbd
          key: PRIMARY
      key_len: 4
          ref: strainprint.product_province.strain_id
         rows: 1
     filtered: 8.74
        Extra: Using where
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: sessions
   partitions: NULL
         type: ref
possible_keys: PRIMARY,sessions_strain_id_id_524ebbd3_fk_strains_id,strain_user
          key: strain_user
      key_len: 4
          ref: strainprint.product_province.strain_id
         rows: 40
     filtered: 100.00
        Extra: Using index
*************************** 4. row ***************************
           id: 1
  select_type: SIMPLE
        table: session_symptoms
   partitions: NULL
         type: ref
possible_keys: session_symptoms_session_id_id_57e72712_fk_sessions_id,session_symptoms_symptom_id_id_af0cc421_fk_symptoms_id,sym_ses_eff,ses_sym_eff
          key: sym_ses_eff
      key_len: 8
          ref: const,strainprint.sessions.id
         rows: 1
     filtered: 100.00
        Extra: Using index
4 rows in set, 1 warning (0.01 sec)

Best Answer

SELECT DISTINCT and GROUP BY do similar things; don't specify both. Keep the GROUP BY.

Suggested indexes. Most are "covering". Some assume one ordering of the tables in the JOIN; some are optimized for the other ordering; the Optimizer can pick which is better.

sessions:  (strain_id_id, user_id_id, id)
session_symptoms:  (symptom_id_id, session_id_id, total_efficacy)
session_symptoms:  (session_id_id, symptom_id_id, total_efficacy)
strains:  (trust_indicator, is_visible, archived, id, product_thc_units, product_cbd_units)
product_province:  (country_of_sale, strain_id)
product_province:  (strain_id, country_of_sale)

20782 / 555160863 is a very low ratio, thereby indicating that innodb_buffer_pool_size is big enough. But there were 7M "read requests", possibly indicating the need for a lot of data. (Caveat: These numbers are across all connections.)

"10 possible additional filters" -- That could lead to 10 separate optimizations, and possibly 40 more indexes needed! Even if we speed up this version, it may not help the other versions.