Sql-server – Any Reason to have AlwaysOn Failover Clustering in Azure Cloud

availability-groups, azure, clustering, sql-server-2016

Is there a reason to have AlwaysOn Failover Clustering when a database is in Azure Cloud?

I understand the benefits of having AlwaysOn Availability Groups in Azure Cloud for database-level protection. People can move heavy read queries to the secondary replica (to avoid locking and blocking, tempdb issues), offload maintenance tasks, etc.

However, why would someone require Failover Clustering in Azure Cloud for server-level protection? Don't cloud services (both PaaS and IaaS) inherently offer server-level protection themselves?

Best Answer

If you are using Azure VMs to host SQL Server, then in my personal experience there is one major reason to build an AlwaysOn FCI: failover happens much more quickly than the migration of an entire VM.

From what I've seen, you're looking at tens of seconds to minutes for a VM failover, versus 1-2 seconds for an AlwaysOn FCI failover running in synchronous commit mode.

In the organisation I work in, that difference is critical, and an extra 60 seconds of downtime can cost the company a significant amount of business and money.
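To put that difference in numbers, here is a rough sketch of the arithmetic. The function name and the cost-per-second figure are hypothetical and only there for illustration; the failover times are picked to match the 60-second gap mentioned above, so plug in your own measurements.

```python
def extra_downtime_cost(slow_failover_s, fast_failover_s, cost_per_second):
    """Cost of the additional downtime incurred by the slower failover path."""
    extra_seconds = max(slow_failover_s - fast_failover_s, 0)
    return extra_seconds * cost_per_second


# Hypothetical inputs: a 62 s VM failover, a 2 s FCI failover,
# and an assumed business cost of 100.0 per second of outage.
print(extra_downtime_cost(62, 2, 100.0))  # 6000.0
```

Multiply the result by the number of failovers you expect per year to get a feel for whether the extra cluster complexity pays for itself.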

I think it's also important to understand what causes these kinds of issues: what will you do if an entire locally redundant system goes down and immediate failover isn't possible? This actually happened to us, after which we realised that both nodes of our two-node AlwaysOn FCI were running on the same rack! (Eek.) Suffice it to say that this is no longer the case. We now replicate between two completely separate regions.
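A check like the following would have caught our same-rack mistake early. It is only a sketch: the node names, regions and rack labels are made up, and it assumes you can get placement information for each node from your own inventory or from your provider.

```python
from collections import defaultdict


def shared_fault_domains(nodes):
    """Return {placement: [node names]} for every placement hosting more than one node."""
    by_placement = defaultdict(list)
    for name, placement in nodes.items():
        by_placement[placement].append(name)
    return {p: sorted(names) for p, names in by_placement.items() if len(names) > 1}


# Hypothetical inventory: node name -> (region, rack)
before = {"sql-node-1": ("region-a", "rack-7"), "sql-node-2": ("region-a", "rack-7")}
after = {"sql-node-1": ("region-a", "rack-7"), "sql-node-2": ("region-b", "rack-2")}

print(shared_fault_domains(before))  # both nodes reported under the same rack
print(shared_fault_domains(after))   # empty dict: no shared placement
```

An empty result means no two nodes share a placement, which is what you want before relying on the cluster for HA.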

As for Azure SQL databases, it usually comes down to DR-scale events, such as an entire data centre outage.

In summary, don't put all your eggs in one basket, and don't assume the word 'cloud' means that you don't need to consider DR and HA.