Sql-server – Automatic Page Repair without AlwaysOn or Mirroring

availability-groups, sql-server, sql-server-2012

Do Microsoft SQL Server 2012 Enterprise regular databases (without AlwaysOn) have Automatic Page Repair in any manner?

I know from the documentation quoted below that, with AlwaysOn, availability replicas get automatic page repair. I am just asking whether this feature is available to SQL Server databases without AlwaysOn.

The MSDN documentation at https://docs.microsoft.com/en-us/sql/database-engine/availability-groups/windows/overview-of-always-on-availability-groups-sql-server?view=sql-server-2017 states:

"Each availability replica tries to automatically recover from corrupted pages on a local database by resolving certain types of errors that prevent reading a data page. If a secondary replica cannot read a page, the replica requests a fresh copy of the page from the primary replica. If the primary replica cannot read a page, the replica broadcasts a request for a fresh copy to all the secondary replicas and gets the page from the first to respond"

Best Answer

Yes, if the database participates in Database Mirroring.

But otherwise, no. If you want page repair, you'll need to do it manually with page-level restores.
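As a rough illustration of what a manual page-level restore involves, here is a minimal sketch that only builds the T-SQL for review and does not run it. The function name `New-PageRestoreScript`, its parameters, and all database names and backup paths are hypothetical. It assumes the database uses the full (or bulk-logged) recovery model, that an unbroken chain of backups exists, and that the damaged page ID (in `file_id:page_id` form) has already been identified, for example from `msdb.dbo.suspect_pages`.

```powershell
# Hypothetical helper: builds (does not execute) the T-SQL for restoring one damaged page.
function New-PageRestoreScript {
    param(
        [Parameter(Mandatory)] [string] $Database,       # database that owns the damaged page
        [Parameter(Mandatory)] [string] $PageId,         # 'file_id:page_id', e.g. '1:57'
        [Parameter(Mandatory)] [string] $FullBackup,     # full backup that contains a good copy of the page
        [string[]] $LogBackups = @(),                    # existing log backups, oldest first
        [Parameter(Mandatory)] [string] $TailLogBackup   # path for the new log backup taken during the restore
    )

    $lines = @()
    # 1. Restore only the damaged page from the full backup.
    $lines += "RESTORE DATABASE [$Database] PAGE = '$PageId' FROM DISK = N'$FullBackup' WITH NORECOVERY;"
    # 2. Roll the page forward through the existing log backups.
    foreach ($log in $LogBackups) {
        $lines += "RESTORE LOG [$Database] FROM DISK = N'$log' WITH NORECOVERY;"
    }
    # 3. Back up the current log, then restore it to bring the page up to date.
    $lines += "BACKUP LOG [$Database] TO DISK = N'$TailLogBackup';"
    $lines += "RESTORE LOG [$Database] FROM DISK = N'$TailLogBackup' WITH RECOVERY;"

    $lines -join "`n"
}
```

Calling it with placeholder values such as `New-PageRestoreScript -Database 'YourDb' -PageId '1:57' -FullBackup 'D:\Backups\YourDb_full.bak' -TailLogBackup 'D:\Backups\YourDb_tail.trn'` returns the statements as text, which can then be reviewed and run in a query window.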

Related Solutions

Sql-server – How Automatic Failover Works (AlwaysOn)

Does it mean that former primary replica (A) will become primary replica automatically?

No. All that means is that once your replica comes back into the picture and the Availability Group database gets back into a SYNCHRONIZED state, it is failover ready. The failback itself will not happen automatically. You would have to either do this "failback" manually or engineer a way to automate it (rather simple, let me know if you want to explore those options).

From a high-level view, your listing of steps is complete after step #3.

Sql-server – AlwaysOn Availability Group Automatic Failover does not work

If I disconnect DEV-AWEB5

Define "disconnect", if you will. My guess is you kept the box up but took SQL Server down.

I cannot connect to the Group Listener (DevListener), but I can ping it and it will respond to my ping

That's because the listener is just a virtual network name (VNN) within the WSFC cluster resource group for the represented availability group. Your DEV-AWEB5 node still owns the cluster resource group; most likely it is just the AG cluster resource that is in a failed state. The VNN must still be online (expected behavior). It simply points to whichever node owns that resource group (in this case, DEV-AWEB5). In fact, if you had PowerShell remoting enabled, and you ran the following:

Invoke-Command -ComputerName "YourListenerName" -ScriptBlock { $env:computername }

The return of that would be the computer name of your owning node.

Likewise, if you can RDP into DEV-AWEB5 (provided you have the capability and accessibility, etc.) then you'd be able to RDP using the listener name (mstsc /v:YourListenerName). It's just a VNN.

By all of your symptoms, I'd be willing to bet that you've reached your failover threshold. The failover threshold determines how many times the cluster will attempt to fail over your resource group in a specified time period. The defaults are a maximum of n - 1 failovers (where n is the count of nodes) in a period of 6 hours. You can see the current values through the following WSFC PowerShell command:

Get-ClusterGroup -Name "YourAgName" |
    Select-Object Name, FailoverThreshold, FailoverPeriod

That just gives you the settings (which you can modify if you so choose, of course).
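If you do decide to change them, a minimal sketch follows. It assumes the FailoverClusters module is available on the node and that "YourAgName" is the cluster group for your availability group; the values 3 and 6 are purely illustrative, not recommendations.

```powershell
Import-Module FailoverClusters

# "YourAgName" is a placeholder for the cluster resource group of the availability group.
$group = Get-ClusterGroup -Name "YourAgName"

# Illustrative values only: allow up to 3 failovers within a 6-hour period.
$group.FailoverThreshold = 3
$group.FailoverPeriod    = 6   # hours

# Read the settings back to confirm the change.
Get-ClusterGroup -Name "YourAgName" |
    Select-Object Name, FailoverThreshold, FailoverPeriod
```

Raising the threshold is handy for repeated failover testing, but it also weakens the protection the threshold is there to provide, so it is worth setting it back afterwards.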

The best way to prove that this is the case for you is to generate the cluster log (the system event logs only go into detail as far as " has failed", or something like that).

Get-ClusterLog -Node "YourClusterNode" -TimeSpan <amount_of_minutes_since_failure>

By default, the output is written to the "C:\Windows\Cluster\Reports" folder, in a file called "Cluster.log".

If you were to open up that cluster log, you should be able to find the following string in there, indicating exactly what happened and why it happened:

Not failing over group [YourClusterGroupName], failoverCount [# of failovers], failover threshold [failover threshold value], nodeAvailCount [node available count].

The above message is simply WSFC telling you that it will not fail over your group because it has already failed over too many times (you hit the threshold).
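Rather than scrolling through the log by hand, you can search it for that string. The following is a minimal sketch; the function name `Find-FailoverThresholdHit` is hypothetical, and the default path assumes the log was generated to the default location mentioned above.

```powershell
# Hypothetical helper: lists cluster log lines where WSFC refused to fail over a group.
function Find-FailoverThresholdHit {
    param(
        # Default location of the generated cluster log; adjust if you used another destination.
        [string] $Path = "C:\Windows\Cluster\Reports\Cluster.log"
    )

    # -SimpleMatch treats the pattern as literal text rather than a regular expression.
    Select-String -Path $Path -Pattern "Not failing over group" -SimpleMatch |
        Select-Object LineNumber, Line
}
```

Running `Find-FailoverThresholdHit` with no arguments searches the default log file and returns each matching line with its line number, so you can jump straight to the surrounding entries.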

Why does this happen? Simply to prevent the Ping-Pong effect of cluster resources going back and forth too frequently between nodes.

While it is common to hit these thresholds during failover testing, in production it would typically point to a problem that should be investigated.
