Sql-server – How to handle maintenance window for DDL commands on Primary replicating to readable secondary AGs

availability-groupsblockingperformancesql serversql-server-2017

We are currently running into this issue for one of our availability group design:-

On Primary :- Nightly Table switch/Purges are happening (DDL) for approx. 2-3 hours

On Secondary which is configured as readable replica there will be SELECT queries running every now and then , as we do not have a window defined for secondary, REDO thread on secondary tends to get blocked by those SELECT queries which keep coming.

Currently to mitigate this issue, manually kill all select queries which keep blocking REDO thread ( this is in turn block other readers) until we see last commit time or redone time from SSMS AG dashboard sync up with the time DDL operation is completed.

Database in question is RCSI

Questions:-

  1. Is this expected way to keep killing SELECT if you don't have a defined maint'ce window for secondary server

  2. How are you guys dealing with this situation? I mean is there a way or script available online that it is automated to KILL such transactions on secondary in FLY when DDL is running on primary and wait until committed on secondary "?

Best Answer

  1. disable read on secondary:

    USE [master] GO LTER AVAILABILITY GROUP [AG-Something] MODIFY REPLICA ON N'SQLSERVER-SECONDARY' WITH (SECONDARY_ROLE(ALLOW_CONNECTIONS = NO)) GO

  2. do your ddll workload on primary

  3. re enable read on secondary

    USE [master] GO

    ALTER AVAILABILITY GROUP [AG-Something] MODIFY REPLICA ON N'SQLSERVER-SECONDARY' WITH (SECONDARY_ROLE(ALLOW_CONNECTIONS = READ_ONLY))

    GO

you can chek the AG status between the step 2 and 3 with this query:

SELECT 
ar.replica_server_name, 
adc.database_name, 
ag.name AS ag_name, 
case HDRS.is_primary_replica
when 1 then 'Primary Replica'
else 'Secondary Replica'
end as Replica,
HDRS.synchronization_state_desc, 
HDRS.synchronization_health_desc, 
HDRS.recovery_lsn, 
HDRS.truncation_lsn, 
HDRS.last_sent_lsn, 
HDRS.last_sent_time, 
HDRS.last_received_lsn, 
HDRS.last_received_time, 
HDRS.last_hardened_lsn, 
HDRS.last_hardened_time, 
HDRS.last_redone_lsn, 
HDRS.last_redone_time, 
HDRS.log_send_queue_size, 
HDRS.log_send_rate, 
HDRS.redo_queue_size, 
HDRS.redo_rate, 
HDRS.filestream_send_rate, 
HDRS.end_of_log_lsn, 
HDRS.last_commit_lsn, 
HDRS.last_commit_time
FROM sys.dm_hadr_database_replica_states AS HDRS
INNER JOIN sys.availability_databases_cluster AS adc 
ON HDRS.group_id = adc.group_id AND 
HDRS.group_database_id = adc.group_database_id
INNER JOIN sys.availability_groups AS ag
ON ag.group_id = HDRS.group_id
INNER JOIN sys.availability_replicas AS ar 
ON HDRS.group_id = ar.group_id AND 
HDRS.replica_id = ar.replica_id