I am using a MySQL 5.1.69-log DB for analysis purpose.
I've designed my database in a star schema. All tables are using InnoDB.
My innodb_buffer_pool_size=2Gb
This is the query I am running :
select
`dw_dimension_criteria`.`id` as `c0`,
count(distinct `dw_fact_program_attendance`.`client_id`) as `m0`
from
`dw_dimension_criteria` as `dw_dimension_criteria`,
`dw_fact_program_attendance` as `dw_fact_program_attendance`
where
`dw_fact_program_attendance`.`criteria_id` = `dw_dimension_criteria`.`id`
group by `dw_dimension_criteria`.`id`;
The EXPLAIN gives the following :
+----+-------------+----------------------------+-------+---------------+--------------+---------+-----------------------------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------------------+-------+---------------+--------------+---------+-----------------------------------------+--------+-------------+
| 1 | SIMPLE | dw_dimension_criteria | index | PRIMARY | PRIMARY | 2 | NULL | 1 | Using index |
| 1 | SIMPLE | dw_fact_program_attendance | ref | criteria_id | criteria _id | 2 | clarity_nevada.dw_dimension_criteria.id | 417669 | |
+----+-------------+----------------------------+-------+---------------+--------------+---------+-----------------------------------------+--------+-------------+
and the SHOW PROFILE command the following :
+----------------------+-----------+
| Status | Duration |
+----------------------+-----------+
| starting | 0.000117 |
| checking permissions | 0.000035 |
| checking permissions | 0.000009 |
| Opening tables | 0.000022 |
| System lock | 0.000004 |
| Table lock | 0.000011 |
| init | 0.000032 |
| optimizing | 0.000012 |
| statistics | 0.000045 |
| preparing | 0.000017 |
| executing | 0.000043 |
| Sorting result | 0.000005 |
| Sending data | 12.847928 |
| end | 0.000103 |
| removing tmp table | 0.000016 |
| end | 0.000016 |
| query end | 0.000005 |
| freeing items | 0.000114 |
| logging slow query | 0.000003 |
| logging slow query | 0.000097 |
| cleaning up | 0.000012 |
+----------------------+-----------+
I do not understand those 12 seconds. I do not understand if it belongs to Sorting Result or Sending Data. I tried adjusting sort_buffer_size
but with no flagrant success. The fact table count about 7 millions rows.
Where should I look to drop those 12 seconds to a more reasonable time ?
Best Answer
If there is a foreign key from
dw_fact_program_attendance(criteria_id)
thatREFERENCES dw_dimension_criteria(id)
, there is no need to join the two tables (but the optimizer is not yet smart enough to understand that.) You can have only one table andGROUP BY criteria_id
:An index on
(criteria_id, client_id)
would help efficiency.