Optimize MySQL average query for millions of rows

mysql, mysql-8.0

I have 4 tables (I've listed the main columns for each):

  1. monitors (id)
  2. monitor_node (monitor_id, node_id, average_response_time)
  3. monitor_checks (id, monitor_id, node_id, response_time)
  4. nodes (id)

Their relations (a rough schema sketch follows the list):

1 monitor – N checks

N monitor – N nodes

1 check – 1 monitor & 1 node
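Roughly, the schema looks like this. Column types and keys are simplified here; the real definitions are in the DB Fiddle linked further down.

create table monitors (
  id int unsigned not null auto_increment,
  primary key (id)
);

create table nodes (
  id int unsigned not null auto_increment,
  primary key (id)
);

-- one row per monitor/node pair, holding the precomputed average
create table monitor_node (
  monitor_id int unsigned not null,
  node_id    int unsigned not null,
  average_response_time double,
  primary key (monitor_id, node_id)
);

-- one row per individual check
create table monitor_checks (
  id bigint unsigned not null auto_increment,
  monitor_id    int unsigned not null,
  node_id       int unsigned not null,
  response_time double,
  primary key (id)
);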


So I'm logging a lot of checks for each monitor. monitor_checks will have millions of rows (up to roughly 500 million).

When I insert another batch of checks (~1k rows), I need to recalculate the average response time per monitor/node pair (table monitor_node, column average_response_time).

I'm pretty sure I'm doing this the wrong way and need a faster solution. What I do now: after inserting 1k rows into monitor_checks, I calculate the average response time for each monitor (grouping by monitor_id). Based on that I build an array and run insert ... on duplicate key update against monitor_node to perform a bulk update. Besides the average response time I calculate a few other attributes, which go into the same bulk update.

Making the array with info and executing insert ... on duplicate key update is fast enough.
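For example, the upsert ends up looking roughly like this (the values shown are placeholders; the real statement is generated from the computed averages and carries the extra attributes as well):

-- bulk upsert of precomputed averages into monitor_node
insert into monitor_node (monitor_id, node_id, average_response_time)
values
  (1, 2, 0.25),
  (2, 2, 0.65)
on duplicate key update
  average_response_time = values(average_response_time);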

The slow query is:

select monitor_id, avg(response_time) as avg_response_time 
from `monitor_checks` 
where `node_id` = 2 
group by `monitor_id`

which takes roughly 10–20 seconds on a table of about 4 million rows.

I also realised that the average doesn't need to be based on all checks; the last 50–100 rows per monitor would be enough. But I couldn't figure out how to do this efficiently. I found a fancy solution for grouping checks per monitor and limiting each group to a number of rows (How to get latest 2 records of each group), but it took far too long as well.
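For illustration, a query along these lines (MySQL 8.0 window functions) expresses "average of the latest 100 checks per monitor", but it still has to read and sort every matching row before discarding most of them, so it stays slow at this scale:

-- sketch: average of the 100 most recent checks per monitor for one node,
-- assuming a higher id means a more recent check
select monitor_id, avg(response_time) as avg_response_time
from (
  select monitor_id,
         response_time,
         row_number() over (partition by monitor_id order by id desc) as rn
  from monitor_checks
  where node_id = 2
) as latest
where rn <= 100
group by monitor_id;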

So the question is: how can I quickly recalculate the average response times in monitor_node when inserting ~1k rows into the monitor_checks table?

DB Fiddle: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=bd95afc030361bf1d87f8bc5c3935c2f

Final desired result:

monitor_id  node_id  average_response_time
1           1        0.30
1           2        0.25
2           1        0.55
2           2        0.65

Best Answer

I would add an array column to monitor_node containing the most recent 100 response times. When a new response time is added, drop the oldest entries so the array never exceeds 100 elements. That way no extra data has to be read when a new average has to be calculated.
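For instance, the extra column could be added like this (the name rec_resp_times matches the trigger below; the exact definition is up to you). Initializing it to an empty array matters, because the JSON functions in the trigger return NULL when fed a NULL value:

-- JSON array of the most recent response times per monitor/node pair
alter table monitor_node
  add column rec_resp_times json;

update monitor_node
   set rec_resp_times = json_array()
 where rec_resp_times is null;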

As an example may say more than a thousand words, I've updated the DB Fiddle. A trigger does the job here, but that logic could just as well be incorporated in the job that does the inserts.

create trigger monitor_checks_air
  after insert on monitor_checks
  for each row
begin
  -- after-insert row trigger: keeps the rolling data in monitor_node current
  update monitor_node mn
     set -- prepend the new response time, then keep only the first 100 entries
         mn.rec_resp_times =
           json_extract(
             json_array_insert(mn.rec_resp_times, '$[0]', new.response_time),
             '$[0 to 99]'),
         -- recompute the average from rec_resp_times (assignments in a
         -- single-table update are evaluated left to right, so this sees
         -- the array updated just above)
         mn.average_response_time = (
             select sum(jt.rt)
               from json_table(mn.rec_resp_times,
                               '$[*]' columns (rt double path '$[0]')) as jt
           ) / json_length(mn.rec_resp_times)
   where mn.monitor_id = new.monitor_id
     and mn.node_id    = new.node_id;
end;
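After that, inserting a check keeps the average current automatically, and reads never touch monitor_checks (the values below are made up):

-- the trigger maintains monitor_node as a side effect of this insert
insert into monitor_checks (monitor_id, node_id, response_time)
values (1, 2, 0.31);

-- the rolling average for that monitor/node pair is already up to date
select monitor_id, node_id, average_response_time
from monitor_node
where monitor_id = 1 and node_id = 2;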

The fastest query is the one you don't execute.