In a recent planning AWS RDS failover we experienced long recovery times of around 20 minutes. AWS provides some hints around why.
large transactions or a lengthy recovery process can increase failover
time.
The database itself doesn't make much use of large transactions, so I don't think this is the culprit. Therefore I'm interested to understand what factors might result in a lengthy recovery process?
- Does the SQL Server version have an impact (We are on 2016)
- Does the size of the database have an impact?
The recovery process took around 20 minutes and we'd like to take steps to reduce this to around 10 minutes.
Update:
One thing I didn't mention is, due to years of bad design about 50% of the database size is due to storing blobs of files, such as html or XML. Could this have any impact on recovery time?
Best Answer
Since you are not using SQL server 2019, ADR won't be applicable. There are few general guidelines, which should be considered in almost all the cases and they are as below:
There is an option of changing target recovery time however that has some down-side also. You should consider them before enabling them.
From MS-site, if you avoid below scenarios, you don't need ADR:
Who should consider accelerated database recovery The following types of customers should consider enabling ADR:
From the comment and new addition to your question about using blob type in your database, yes having these data types can significantly delay any checkpoint by database. In stead, you should use CAS storage type as explained well by Mr. Brent Ozar:
https://www.brentozar.com/archive/2015/03/no-more-blobs/
Hope this helps you in taking right decision.