I have the following query which counts when the event_time is equal to the day of the month.
I also have indexes on the id of each table, template_name, from_email_address, event_time, event_status.
Is there any more way I can optimise it?
I am running this against a database of at least 30 million records and this query will likely scan through at least 3 million records. I also have to run this query multiple times for different from_email_addresses.
Edited to include the explain query result.
select_type tbl type possible_keys key
PRIMARY e ref PRIMARY, IDX_FROM_EMAIL_ADDRESS IDX_FROM_EMAIL_ADDRESS
PRIMARY d eq_ref PRIMARY PRIMARY
PRIMARY s ref PRIMARY, IDX_EVENT_TIME PRIMARY
DEPENDENT SUBQUERY ms ref PRIMARY, IDX_EVENT_STATUS IDX_EVENT_STATUS
ref rows Extra
const 3,012,345 Using where, Using index, Using temporary, Using filesort
e.id 1
e.id,func 1 Using where; Using index
const,d.id 1 Using where; Using index
SELECT
d.template_name,
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 1 THEN 1 END) AS '1',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 2 THEN 1 END) AS '2',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 3 THEN 1 END) AS '3',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 4 THEN 1 END) AS '4',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 5 THEN 1 END) AS '5',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 6 THEN 1 END) AS '6',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 7 THEN 1 END) AS '7',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 8 THEN 1 END) AS '8',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 9 THEN 1 END) AS '9',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 10 THEN 1 END) AS '10',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 11 THEN 1 END) AS '11',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 12 THEN 1 END) AS '12',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 13 THEN 1 END) AS '13',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 14 THEN 1 END) AS '14',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 15 THEN 1 END) AS '15',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 16 THEN 1 END) AS '16',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 17 THEN 1 END) AS '17',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 18 THEN 1 END) AS '18',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 19 THEN 1 END) AS '19',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 20 THEN 1 END) AS '20',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 21 THEN 1 END) AS '21',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 22 THEN 1 END) AS '22',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 23 THEN 1 END) AS '23',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 24 THEN 1 END) AS '24',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 25 THEN 1 END) AS '25',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 26 THEN 1 END) AS '26',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 27 THEN 1 END) AS '27',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 28 THEN 1 END) AS '28',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 29 THEN 1 END) AS '29',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 30 THEN 1 END) AS '30',
COUNT(CASE WHEN DAYOFMONTH(s.event_time) = 31 THEN 1 END) AS '31',
COUNT(*) AS 'Grand Total'
FROM
detail d
LEFT JOIN
email e on d.id = e.id
LEFT JOIN
status s on s.id = d.id
AND s.event_time = (
SELECT
MINE(ms.event_time)
FROM
status ms
WEHRE
ms.id = d.id
AND
ms.event_status = 'SENT'
)
WHERE
CASET(s.event_time AS DATE) BETWEEN '2020-03-01' AND '2020-03-31'
AND
e.from_email_address = 'abc@google.com'
GROUP BY
d.template_name
Best Answer
A "better" way:
Step 1. Build an intermediate table:
Step 2. Pivot. (Since this is thoroughly discussed in other Q&A, I simply added the
[pivot]
tag; go see.)More concise, though maybe not any faster than the
CASE(...31...)
Faster
Build and maintain a Summary table. Rather than rescanning millions of rows when you want this 'report', scan a much smaller and possibly better-indexed summary table. http://mysql.rjweb.org/doc.php/summarytables
Such a table might effectively be the result of my "Step 1", above. However, it would be augmented daily (or continually), not rebuilt.