MySQL – Best way to optimize an aggregated issues-over-time query

MySQL

I have a query that builds an aggregated issues-over-time report, including back-filling data for entities with missing dates. I've been using the following query for a few years, but it is starting to max out my DB CPU on big datasets. I'm a little lost on what to focus on first and could use some guidance.

SELECT t1.yearmonth, SUM(rt.errors) AS errors, SUM(rt.alerts) AS alerts
FROM (
    SELECT d.yearmonth, r.user_id, MAX(r.date) AS date
    FROM (
        SELECT DISTINCT DATE_ADD(date, INTERVAL -WEEKDAY(date)-1 DAY) AS yearmonth
        FROM reports
    ) d
    INNER JOIN reports r
        ON d.yearmonth >= DATE_ADD(r.date, INTERVAL -WEEKDAY(r.date)-1 DAY)
    GROUP BY d.yearmonth, r.user_id
) t1
INNER JOIN reports r1 ON r1.user_id = t1.user_id AND r1.date = t1.date
INNER JOIN report_totals rt ON rt.report_id = r1.id
GROUP BY t1.yearmonth
ORDER BY t1.yearmonth

The example here creates results based on a weekly interval, but my application supports daily, monthly, quarterly, and yearly as well.

I have a SQL Fiddle set up here for further context:
http://sqlfiddle.com/#!9/6c7cc82/7

I think the performance hit may be coming from the date_add and date_format functions, but I'm not sure how to accomplish what I'm doing without them. Maybe a calendar table? Any help would be appreciated, thanks!

Best Answer

Save the aggregation in a "summary table", then add new row(s) each night. The 'report' would then sum the sums and sum the counts. (An average would be (sum of sums) / (sum of counts).)
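A minimal sketch of that pattern, not a drop-in replacement. The table report_daily_summary and its columns are hypothetical names. The reports and report_totals columns are taken from the query in the question, and their types are assumed. The sketch also assumes one report_totals row per report. It does not reproduce the back-filling of missing dates from the question; it only shows the nightly-insert and sum-the-sums idea.

```sql
-- Hypothetical summary table: one row per calendar day.
-- All names and types here are assumptions for illustration.
CREATE TABLE IF NOT EXISTS report_daily_summary (
    summary_date DATE   NOT NULL,
    report_count INT    NOT NULL,
    errors_sum   BIGINT NOT NULL,
    alerts_sum   BIGINT NOT NULL,
    PRIMARY KEY (summary_date)
);

-- Nightly job: summarize yesterday's reports into one row.
-- Re-running it overwrites the row instead of double counting.
-- VALUES() in this position is deprecated in recent MySQL 8.0
-- releases but still accepted.
INSERT INTO report_daily_summary
    (summary_date, report_count, errors_sum, alerts_sum)
SELECT DATE(r.date), COUNT(*), SUM(rt.errors), SUM(rt.alerts)
FROM reports r
INNER JOIN report_totals rt ON rt.report_id = r.id
WHERE r.date >= CURDATE() - INTERVAL 1 DAY
  AND r.date <  CURDATE()
GROUP BY DATE(r.date)
ON DUPLICATE KEY UPDATE
    report_count = VALUES(report_count),
    errors_sum   = VALUES(errors_sum),
    alerts_sum   = VALUES(alerts_sum);

-- Report: roll the daily rows up to weeks, using the same
-- week expression as the question. The average is
-- (sum of sums) / (sum of counts).
SELECT DATE_ADD(summary_date, INTERVAL -WEEKDAY(summary_date)-1 DAY) AS week_start,
       SUM(errors_sum) AS errors,
       SUM(alerts_sum) AS alerts,
       SUM(errors_sum) / SUM(report_count) AS avg_errors_per_report
FROM report_daily_summary
GROUP BY week_start
ORDER BY week_start;
```

The report then reads one small row per day instead of joining the full reports table to itself, and the other intervals (monthly, quarterly, yearly) only change the grouping expression.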

I'm confused. You seem to have a date in the Fact table, yet you want to summarize for one day? Or a range of days? Please provide SHOW CREATE TABLE and say approximately how big the tables are.
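Something along these lines would capture both. The table names come from the query in the question, and the row counts reported for InnoDB tables are only estimates.

```sql
-- Table definitions, including indexes.
SHOW CREATE TABLE reports;
SHOW CREATE TABLE report_totals;

-- Approximate row counts and on-disk sizes for both tables.
SELECT table_name, table_rows, data_length, index_length
FROM information_schema.tables
WHERE table_schema = DATABASE()
  AND table_name IN ('reports', 'report_totals');
```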

More discussion: http://mysql.rjweb.org/doc.php/summarytables