SQL Server Availability Groups – Monitoring Redo Rates

availability-groups, sql-server

We currently use Idera Diagnostic Manager to monitor our SQL environment, which now includes a couple of Availability Groups. Idera monitors the redo synch rates and starts to complain when the rates are too low.

My question is this: should we care about the redo rate when there is nothing in the redo queue? It seems to me that the Log Send Queue and Redo Queue are better indicators of when the replicas are falling behind. Would an alert based on these queues be better than an alert on the redo rate?

Thanks.

Best Answer

Should we care about the redo rate when there is nothing in the redo queue?

If there is nothing in the redo queue and the log send queue is empty, then I wouldn't worry about it, as there is literally nothing to be done. This happens on AGs that have light usage, or usage concentrated in specific hours with the AG idle the rest of the time.

If, however, there IS something in the log send queue and/or redo queue we are extremely interested in the redo rate.

We are ALWAYS interested in blocked redo threads as well.
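To illustrate why the redo rate matters once something is queued, here is a rough sketch of the usual back-of-the-envelope estimate: queued redo divided by redo rate gives the approximate time for the secondary to catch up. The function name is hypothetical, and the inputs are assumed to be the redo queue size in KB and the redo rate in KB/sec, as a monitoring tool might report them.

```python
def estimated_redo_catchup_seconds(redo_queue_kb: float,
                                   redo_rate_kb_per_sec: float) -> float:
    """Rough time for a secondary to redo what is already queued.

    Hypothetical helper: assumes queue size in KB and rate in KB/sec.
    """
    if redo_queue_kb <= 0:
        # Nothing queued, so the rate is irrelevant.
        return 0.0
    if redo_rate_kb_per_sec <= 0:
        # Work is queued but nothing is being redone (e.g. blocked redo).
        return float("inf")
    return redo_queue_kb / redo_rate_kb_per_sec
```

With an empty queue the result is zero no matter how low the rate is, which is the idle-AG case above. With a backlog, a low rate translates directly into a long catch-up time.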

It seems to me that the Log Send Queue and Redo Queue are better indicators of when the replicas are falling behind. Would an alert based on these queues be better than an alert on the redo rate?

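These all measure different aspects of the AG and where performance might be lacking. For example, we'd troubleshoot a large log send queue much differently than a low redo rate: a large log send queue typically points us at the primary and the network between replicas, while a low redo rate points us at the secondary. Sure, they may be related, but we'd be looking in two different areas at first.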

It would be best to build in some logic for the alerts. I would still keep the redo alert and add one for blocked redo threads (since that one wasn't mentioned), but I would monitor the full stack as redo rate is just giving you one small piece of the puzzle.
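As a minimal sketch of what "some logic" could look like, the snippet below only raises the redo rate alert when there is actually a backlog, and always flags blocked redo. Everything here is an assumption for illustration: the `ReplicaSample` fields, the threshold values, and the alert wording are hypothetical and would need to be mapped to whatever your monitoring tool exposes and tuned for your environment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReplicaSample:
    """One monitoring sample for a secondary database (hypothetical shape)."""
    log_send_queue_kb: float
    redo_queue_kb: float
    redo_rate_kb_per_sec: float
    redo_blocked_per_sec: float


# Hypothetical thresholds; tune these for your own workload.
MAX_LOG_SEND_QUEUE_KB = 100_000
MAX_REDO_QUEUE_KB = 100_000
MIN_REDO_RATE_KB_PER_SEC = 1_000


def evaluate(sample: ReplicaSample) -> List[str]:
    """Return the alerts that apply to this sample."""
    alerts = []

    # Blocked redo is always worth knowing about.
    if sample.redo_blocked_per_sec > 0:
        alerts.append("redo thread blocked")

    # Queue sizes show that a replica is behind, and where.
    if sample.log_send_queue_kb > MAX_LOG_SEND_QUEUE_KB:
        alerts.append("log send queue is large")
    if sample.redo_queue_kb > MAX_REDO_QUEUE_KB:
        alerts.append("redo queue is large")

    # A low redo rate only matters when there is work waiting.
    backlog = sample.log_send_queue_kb > 0 or sample.redo_queue_kb > 0
    if backlog and sample.redo_rate_kb_per_sec < MIN_REDO_RATE_KB_PER_SEC:
        alerts.append("redo rate is low while a backlog exists")

    return alerts
```

For an idle AG, `evaluate(ReplicaSample(0, 0, 0, 0))` returns an empty list, so the low redo rate on its own no longer generates noise.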