SQL Server 2016 – When to Remove Databases from Availability Group

availability-groupshigh-availabilitysql-server-2016transaction-logtransactional-replication

While patching one of my servers that is part of an availability group,
as a secondary, I had some problems – the real problem was that the host machine of one of the virtual servers in the availability group – run out of disk space for a brief minute or two.

It does not matter what was the problem, for this question.

The facts are:

I had 2 machines on the availability group, lost the secondary.
The availability group was involved in replication and I was getting the following error message:

Replicated transactions are waiting for next Log backup or for
mirroring partner to catch up

which was solved by starting the following trace flag:

DBCC TRACEON(1448,-1)

I find some info about my availability group, and I can see one server is down according to the scripts below:

select * from sys.dm_hadr_cluster_members

SELECT
gs.primary_replica as 'Primary Server',
DB_NAME(dd.database_id) as 'Database Name',
secondary_recovery_health_desc as 'Secondary Server Health Status',
synchronization_state_desc as 'Synchronization State',
database_state_desc as 'Database State',
--suspend_reason_desc as 'Suspended Reason',
--last_sent_time as 'Last Data Send Time',
--last_received_time as 'Last Data Received TIme',
--last_hardened_time as 'Last Hardened Time',
--last_redone_time as 'Last Redone Time',
--log_send_queue_size as 'Log Send Queue Size',
--log_send_rate as 'Log Send Rate',
--redo_queue_size as 'Redo Queue Size',
--redo_rate as 'Rate of Redo',
--filestream_send_rate as 'Filestream Send Rate',
last_commit_time as 'Last Commit Time',
low_water_mark_for_ghosts as 'Low Water Mark for Ghosts'
FROM sys.dm_hadr_availability_group_states as gs
left outer JOIN sys.dm_hadr_database_replica_states as dd ON gs.group_id = dd.group_id
ORDER BY gs.primary_replica DESC, DB_NAME(dd.database_id) DESC

The always On script at this question does not tell me much either

QUESTION:

How much far behind am I in the transaction log, catching up with a replica that will never come back?

I need information to solve this question:

Should I remove the databases from the availability group?

or should I build a new server, add it to the availability group, wait for the logs to catch up?

Best Answer

I have decided to remove the databases from the availability group, on this occasion. for this I have used the following scripts:

-- https://blog.pythian.com/list-of-sql-server-databases-in-an-availability-group/
-- How do we know that this instance is part of Availability Group(s):
SELECT
AG.name AS [Name],
ISNULL(agstates.primary_replica, '') AS [PrimaryReplicaServerName],
ISNULL(arstates.role, 3) AS [LocalReplicaRole]
FROM master.sys.availability_groups AS AG
LEFT OUTER JOIN master.sys.dm_hadr_availability_group_states as agstates
    ON AG.group_id = agstates.group_id
INNER JOIN master.sys.availability_replicas AS AR
    ON AG.group_id = AR.group_id
INNER JOIN master.sys.dm_hadr_availability_replica_states AS arstates
    ON AR.replica_id = arstates.replica_id AND arstates.is_local = 1
ORDER BY [Name] ASC

-- https://blog.pythian.com/list-of-sql-server-databases-in-an-availability-group/
-- How do we get some information about the databases in the Availability Group:
SELECT
AG.name AS [AvailabilityGroupName],
ISNULL(agstates.primary_replica, '') AS [PrimaryReplicaServerName],
ISNULL(arstates.role, 3) AS [LocalReplicaRole],
dbcs.database_name AS [DatabaseName],
ISNULL(dbrs.synchronization_state, 0) AS [SynchronizationState],
ISNULL(dbrs.is_suspended, 0) AS [IsSuspended],
ISNULL(dbcs.is_database_joined, 0) AS [IsJoined],
remove_database='USE [master]' + char(13) + 'ALTER AVAILABILITY GROUP ' + quotename(AG.name ) + char(13) + 'REMOVE DATABASE ' + quotename(dbcs.database_name) + char(13)
FROM master.sys.availability_groups AS AG
LEFT OUTER JOIN master.sys.dm_hadr_availability_group_states as agstates
   ON AG.group_id = agstates.group_id
INNER JOIN master.sys.availability_replicas AS AR
   ON AG.group_id = AR.group_id
INNER JOIN master.sys.dm_hadr_availability_replica_states AS arstates
   ON AR.replica_id = arstates.replica_id AND arstates.is_local = 1
INNER JOIN master.sys.dm_hadr_database_replica_cluster_states AS dbcs
   ON arstates.replica_id = dbcs.replica_id
LEFT OUTER JOIN master.sys.dm_hadr_database_replica_states AS dbrs
   ON dbcs.replica_id = dbrs.replica_id AND dbcs.group_database_id = dbrs.group_database_id
ORDER BY AG.name ASC, dbcs.database_name

Best Answer

Related Solutions

SQL Server AlwaysON – Troubleshooting Log_Send_Queue_Size Issues

SQL Server – How AlwaysOn Availability Group Secondary Replica Catches Up After Downtime

Related Question