We have a number of SQL Servers with Always on Availability Groups in asynchronous mode between a primary and secondary server with Manual fail over.
I created a latency report utilizing the below query that collects the data every min on each server.
On one of our servers the secondary routinely shows that it has an earlier last_commit_time then the primary.
I verified the server times are the same on both servers.
Why might this be?
;WITH
AG_Stats AS
(
SELECT AR.replica_server_name,
AG.name as AGName,
HARS.role_desc,
Db_name(DRS.database_id) [DBName],
DRS.last_commit_time
FROM sys.dm_hadr_database_replica_states DRS
INNER JOIN sys.availability_replicas AR ON DRS.replica_id = AR.replica_id
INNER JOIN sys.dm_hadr_availability_replica_states HARS ON AR.group_id = HARS.group_id
AND AR.replica_id = HARS.replica_id
INNER JOIN [sys].[availability_groups] AG on AG.group_id = AR.group_id
),
Pri_CommitTime AS
(
SELECT replica_server_name
, AGNAME
, DBName
, last_commit_time
FROM AG_Stats
WHERE role_desc = 'PRIMARY'
),
Sec_CommitTime AS
(
SELECT replica_server_name
, AGNAME
, DBName
, last_commit_time
FROM AG_Stats
WHERE role_desc = 'SECONDARY'
)
SELECT p.replica_server_name [primary_replica]
, p.AGNAME
, p.[DBName] AS [DatabaseName]
, s.replica_server_name [secondary_replica]
, DATEDIFF(ss,s.last_commit_time,p.last_commit_time) AS [Sync_Latency_Secs]
FROM Pri_CommitTime p
LEFT JOIN Sec_CommitTime s ON [s].[DBName] = [p].[DBName] and s.AGNAME = p.AGNAME
Best Answer
It is worth noting that the value for
last_commit_time
is defined differently based on whether it is being queried from the secondary or primary replica.In addition to the above, a secondary replica that has a delay of a few seconds on the commit will yield a positive number, but one that hasn't finished the commit yet will yield a negative number in your query.
So if you have a latency of 10 seconds but query at the 5 second mark, since there is no commit time yet on the secondary for the primary's last commit, you're going to get the previous commit time which is probably before the primary's latest commit time. This is expected with asynchronous replication.