Sql-server – At which point in a log backup does SQL Server truncate the log file

sql server

I am investigating an issue where, on a somewhat predictable schedule, our third-party backup software begins a full backup of some of our databases during the workday. It starts a log backup, detects a broken chain in the LSNs, and then converts the job to a full backup.

I checked and there are no other log backups being run against the SQL servers we are experiencing this with.

What happens is that a log backup begins on schedule, but on some days the connection with the backup server is lost during the backup. The job sits in limbo and when it restarts with the same job id there is a broken chain detected and it begins a full backup of affected databases.

Scouring the logs, I can see the backup failures and the backup restarting with the same LSN, which is no longer there.

So my question is, when there is a log backup with the truncate option does SQL Server truncate the log as it is backed up, or at the successful completion of the job? In my case it seems like a range of LSN's is being backed up, marked as backed up and when the job fails it looks for those same LSN's again when the job restarts.

Best Answer

Technically, SQL Server only attempts to mark virtual log files (VLFs) as inactive (reusable) once they have been successfully backed up and are no longer needed by internal (and sometimes external) processes, such as replication or availability groups.
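To see whether something other than a log backup is holding VLFs active, a query along these lines may help. It is a minimal sketch: YourDatabase is a placeholder name, not something from the question.

```sql
-- Shows what, if anything, is currently preventing log truncation.
-- N'YourDatabase' is a hypothetical name; substitute the affected database.
SELECT name,
       recovery_model_desc,
       log_reuse_wait_desc   -- e.g. NOTHING, LOG_BACKUP, ACTIVE_TRANSACTION, REPLICATION, AVAILABILITY_REPLICA
FROM sys.databases
WHERE name = N'YourDatabase';
```

A value of LOG_BACKUP means the log is waiting on the next successful log backup before those VLFs can be reused.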

What happens is that a log backup begins on schedule, but on some days the connection with the backup server is lost during the backup.

This shouldn't cause any issues with SQL Server. SQL Server will see that the connection was broken and kill the session plus whatever was executing. This means that backup wasn't successful so the next log backup should start at the same place because it hasn't yet finished successfully.

The job sits in limbo and when it restarts with the same job id there is a broken chain detected and it begins a full backup of affected databases.

Sounds like the application thinks it ran a backup successfully even though it didn't and "detects" there is a problem... but there isn't. It really sounds like the application logic in the backup program is either flawed or running into a config issue... or it might be their standard logic (by design).

Now if it takes a full backup and walks away... that's not going to affect the log reuse.

But on a certain day of the week, at a certain time of day half of the jobs hang because the backup server goes down for a few minutes and when they restart again they begin a full backup on some databases. And from what I can tell it looks for the LSN, runs a backup, job fails, restarts looking for the same LSN.

Sounds like you have other infrastructure issues that may be contributing. Assuming they are not, though, it sounds like the backup application is confused. If the job fails to back up the log successfully, the next attempt should start at the same place, because that portion of the log hasn't yet been successfully backed up. If the backup application chooses to see that as an issue and takes a full backup to reset itself (the backup application) internally, that's on the application vendor to fix/decide/working as intended/whatever after you tell them about it.
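One way to check whether the chain is actually broken on the SQL Server side is to look at the backup history in msdb. The following is a sketch under some assumptions: YourDatabase is a placeholder, the third-party tool records its backups in msdb as most tools do, and the instance is SQL Server 2012 or later (for LAG).

```sql
-- Lists recorded log backups in order, next to the previous backup's last_lsn.
-- In an unbroken chain, each first_lsn should equal previous_last_lsn.
SELECT bs.database_name,
       bs.backup_start_date,
       bs.backup_finish_date,
       bs.first_lsn,
       bs.last_lsn,
       LAG(bs.last_lsn) OVER (PARTITION BY bs.database_name
                              ORDER BY bs.backup_finish_date) AS previous_last_lsn
FROM msdb.dbo.backupset AS bs
WHERE bs.database_name = N'YourDatabase'   -- hypothetical database name
  AND bs.type = 'L'                        -- log backups only
  AND bs.is_copy_only = 0                  -- copy-only backups do not advance the chain
ORDER BY bs.backup_finish_date;
```

If the LSNs line up here around the time of the outage, the gap exists only in the backup application's own catalog, which is useful evidence to take to the vendor.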

However, since this only happens when the backup application server goes down, you may also want to have a stern talk with your infrastructure team and get that issue sorted out. At the very least, if the downtime is because of patching or something similar, ask that all jobs be held until the patching (or whatever) is completed.

Related Solutions

SQL Server – Best Practices for Maintaining Log File Sizes

Why doesn't the log file shrink after my backups? Is it because there are uncommitted transactions?

The actual NTFS log file doesn't "shrink" from a transaction log backup, but VLFs (Virtual Log Files) within the transaction log are marked for reuse (because they are now backed up and persisted on media), allowing the wrap-around of transaction log use to occur. If you aren't backing up the transaction log, or not backing it up frequently enough, then there will be no available VLFs, and that will cause the transaction log to grow (provided that autogrowth is set) to accommodate additional transaction log entries.
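As a rough illustration, log usage can be checked before and after a log backup. The database name and the backup path below are placeholders; and if a third-party tool owns the log chain, an ad hoc native log backup like this one would interfere with it, so treat it as a sketch for environments that use native backups.

```sql
-- Reports log file size and percentage in use for every database on the instance.
DBCC SQLPERF(LOGSPACE);

-- A routine log backup; afterwards, inactive VLFs can be reused.
-- Database name and file path are hypothetical.
BACKUP LOG [YourDatabase]
TO DISK = N'X:\BackupFiles\YourDatabase_log.trn';

-- The file size stays the same, but "Log Space Used (%)" should drop.
DBCC SQLPERF(LOGSPACE);
```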

2. At first I was thinking I should shrink the log files after every 5:00 AM backup. After reading up on how that's bad for performance I now believe that I need to take regular log backups every couple of hours during the day. Is that correct?

Routine and scheduled file shrinkage is not a good idea. Only when you need to reclaim much-needed space should you consider a DBCC SHRINKFILE. Also, when you are continuously growing your transaction log, you could be hindering other things, such as the recovery of the database. With too many VLFs in the transaction log (a common problem when the transaction log is only grown by a small storage increment), the amount of time to recover the database could be longer than desired.
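For reference, the VLF count and a one-off shrink might look like the following. This is a sketch with assumptions: sys.dm_db_log_info requires SQL Server 2016 SP2 or later (older versions can use DBCC LOGINFO instead), and the logical file name and target size are placeholders.

```sql
-- Count the VLFs in the current database's transaction log.
SELECT COUNT(*) AS vlf_count
FROM sys.dm_db_log_info(DB_ID());

-- One-off shrink of the log file to a target size in MB.
-- N'YourDatabase_log' is a hypothetical logical file name (see sys.database_files).
DBCC SHRINKFILE (N'YourDatabase_log', 1024);
```

After a one-off shrink, growing the log back to its working size in a few large steps, rather than many small autogrowth increments, tends to keep the VLF count down.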

3. My normal full backup of the database/logs happens every day at 5:00 AM and sometimes takes 3 hours. If I schedule the log backups to happen every hour, what will happen when the log backup collides with the 5:00 AM backup?

Nothing will happen; that is a completely legal operation. MSDN publishes a chart of which backup and maintenance operations can run concurrently: where there is a black dot, the two operations cannot occur at the same time. According to that chart, a database backup and a transaction log backup are allowed to run concurrently.


The takeaway here is that you should be backing up your transaction log more frequently. NTFS file growth isn't the only problem you could run into by not backing up your transaction log more frequently. If you were to have a storage failure and your transaction log is lost, then you can only restore to the point in time of your last transaction log backup. If the transaction log is lost, you won't be able to back up the tail of the log and restore to the point in time of the failure. In your case, you could potentially lose 24 hours' worth of data. But if you back up your transaction logs every, say, 30 minutes, then your maximum data loss would be 30 minutes. In that case, if your transaction log is gone, and you have your full backup and your intact log chain, you could restore to that last log backup.
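If the data files are damaged but the log file itself survives, a tail-log backup can usually still be attempted. A minimal sketch, with a placeholder database name and path:

```sql
-- Tail-log backup of a damaged database whose log file is still intact.
-- NO_TRUNCATE backs up the log without truncating it, even if the data files are unavailable.
-- Database name and file path are hypothetical.
BACKUP LOG [YourDatabase]
TO DISK = N'X:\BackupFiles\YourDatabase_tail.trn'
WITH NO_TRUNCATE;
```

This only works when the log file is readable; if the log itself is lost, the last regular log backup is the furthest point you can restore to.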

TechNet documentation on Transaction Log Truncation

Sql-server – Restoring differential backup from SQL Server 2005 to SQL Server 2012

There should be no issue restoring differentials and fulls from SQL 2005 to SQL 2012. I would validate that your backup files are compatible. To do this, you'll want to use the RESTORE HEADERONLY command and compare the full backup's FirstLSN value with the differential's DifferentialBaseLSN:

RESTORE HEADERONLY
FROM DISK = N'X:\BackupFiles\foo.bak';


If these values do not match, then you will need to take an appropriate full backup.
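Once the LSNs match, the restore itself could look roughly like this. The database name and the differential file name (foo_diff.bak) are placeholders, and a MOVE clause may be needed if the file paths differ on the SQL Server 2012 instance.

```sql
-- Restore the full backup first, leaving the database in the restoring state.
RESTORE DATABASE [YourDatabase]
FROM DISK = N'X:\BackupFiles\foo.bak'
WITH NORECOVERY;

-- Then apply the differential and bring the database online.
-- foo_diff.bak is a hypothetical file name for the differential backup.
RESTORE DATABASE [YourDatabase]
FROM DISK = N'X:\BackupFiles\foo_diff.bak'
WITH RECOVERY;
```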
