Sql-server – If the SIDs for ##certificate-based## SQL Server Logins don’t match across replicas in an Availability Group, will this be a problem on failover

availability-groupssql-server-2016

I am reading mixed messages about server level object syncing between replicas in an Availability Group.

As I understand it, server-level objects must be kept in sync (aside: then why isn't it part of AG by default?). For example, if a Login (identified by SID) does not exist on a Secondary replica (regardless of username and password) and the system fails over to that replica, then the corresponding principal will fail to authenticate.

All well and good – keeping these objects in sync is a task to be done. But what about certificate-based Logins? I read these are for internal system use only, so I would expect the DB engine to maintain SIDs across replicas if that's required. I have a case with three nodes; the third being read-only. I see the Primary's SIDs for logins such as ##MS_SQLAuthenticatorCertificate## do not exist on either of the other nodes. However on those other nodes, there exists a login with the same name and a different SID, but that SID is the same between those two nodes, viz:

SQL01 (Primary) SID = ABC
SQL02 (Secondary) SID = DEF
SQL03 (Read-only) SID = DEF

Is this of concern? Will a failover work despite these Logins having mismatching SIDs between some replicas?

Best Answer

Is this of concern? Will a fail-over work despite these Logins having mismatching SIDs between some replicas?

I don't think this is a concern with fail-over due to following facts:

The fail-over basically targets user databases based on database mirroring endpoint
CONNECT permissions on the ENDPOINT required to user defined LOGIN or SERVICE ACCOUNT (not for server principals named like ##%).

As you mentioned, fundamentally, the SID is the correlation between Login and User in a database level. Since the server level metadata sync is out of scope in availability groups, we must maintain Server Level login's SID same in all the replicas (SQL Instances), so that it would match with User SID in secondary copy of the database.

When fail-over happens there is no failure of application connectivity as the login SID = user SID, eventually application gets the same permissions on the database. Otherwise users in database become Orphaned

Following are some points that indicates why we do not have to concern on server principals that are named '##%':

These are created during SQL Server installation, we do not have control over these to manage SID
These are only internal purpose, meaning only exists in system databases, doesn't have any relation with user database principals (sys.database_principals)
There is no issue with user DB connection, since any application usually doesn't have any relation with the principals that are named like '##%', and the principals (##%) not existed in user database.
Finally, our aim is to match Server principal's SID = database principal's SID within each replica. For the principals that are named like ##% are by default and always maintains same SID within replica which you can verify with following query:

select name, sid from sys.server_principals where name like '##%'

select name, sid from master.sys.database_principals where name like '##%'
select name, sid from msdb.sys.database_principals where name like '##%'

As I understand it, server-level objects must be kept in sync (aside: then why isn't it part of AG by default?).

You're right, it's limitation of availability group which clearly stated here.., in this case contained database feature would be a solution to persists the authentication with database itself.

Not only logins, but also there are server level metadata aspects that need to be manged outside the availability group as required.

Related Solutions

Sql-server – Adding a distributor for availability groups

The statement you are reading in the Microsoft article says that you should not (and cannot) put a Distribution database inside an Availability Group.

Why?

Because the distribution database is a System database and Availability Groups will only fail over User databases. Therefore you will require another (4th) server in your situation to act as the (remote/scaled out) Distributor.

Sql-server – ApplicationIntent=ReadOnly Traffic when no Readable Secondary Available

There are several steps to configuring a server to accept ReadOnly traffic. The following link walks you through it, http://msdn.microsoft.com/en-us/library/hh710054.aspx ,but basically you need to configure each server in the AG and then set up the routing for each.

Here's the T-SQL involved:

ALTER AVAILABILITY GROUP [AG1]
 MODIFY REPLICA ON
N'COMPUTER01' WITH 
(SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY));
ALTER AVAILABILITY GROUP [AG1]
 MODIFY REPLICA ON
N'COMPUTER01' WITH 
(SECONDARY_ROLE (READ_ONLY_ROUTING_URL = N'TCP://COMPUTER01.contoso.com:1433'));

ALTER AVAILABILITY GROUP [AG1]
 MODIFY REPLICA ON
N'COMPUTER02' WITH 
(SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY));
ALTER AVAILABILITY GROUP [AG1]
 MODIFY REPLICA ON
N'COMPUTER02' WITH 
(SECONDARY_ROLE (READ_ONLY_ROUTING_URL = N'TCP://COMPUTER02.contoso.com:1433'));

ALTER AVAILABILITY GROUP [AG1] 
MODIFY REPLICA ON
N'COMPUTER01' WITH 
(PRIMARY_ROLE (READ_ONLY_ROUTING_LIST=('COMPUTER02','COMPUTER01')));

ALTER AVAILABILITY GROUP [AG1] 
MODIFY REPLICA ON
N'COMPUTER02' WITH 
(PRIMARY_ROLE (READ_ONLY_ROUTING_LIST=('COMPUTER01','COMPUTER02')));
GO

Sounds like you may be missing the configuration and/or routing information for the primary.

Best Answer

Related Solutions

Sql-server – Adding a distributor for availability groups

Sql-server – ApplicationIntent=ReadOnly Traffic when no Readable Secondary Available

Related Question