Sql-server – Full Database backup internals

backupsql server

My understanding of a full backup is (based on https://technet.microsoft.com/en-us/library/2009.07.sqlbackup.aspx – most of the below is copied, from this link):

  1. Force a database checkpoint and make a note of the log sequence number at this point. This flushes all updated-in-memory pages to disk before anything is read by the backup to help minimize the amount of work the recovery part of restore has to do.
  2. Start reading from the data files in the database.
  3. Stop reading from the data files and make a note of the log sequence number of the start of the oldest active transaction at that point
  4. Read as much transaction log as is necessary.

Backing up enough of the transaction log is required so that recovery can successfully run during the restore and so that all pages in the database are at the same point in time—the time at which the data reading portion of the backup operation completed. The transaction log from the beginning of the oldest active (or uncommitted) transaction to the end of the backup, is required to allow recovery to run. The transaction log back from the backup checkpoint to the end of the backup is required to allow all pages to be brought to the same point in time and for transactional consistency.

The minimum log sequence number (LSN) of the transaction log that is included in the full backup is MIN (LSN of backup checkpoint, LSN of oldest active transaction) and it could be for a transaction that began even before the backup began. This ensures that the restored copy of the database (or whatever you restored—page, file, filegroup, database) is completely consistent.

Questions when I read This book from Microsoft:

  1. The book from Microsoft suggests that MIN LSN – "Determines the minimum LSN required for the log backup by taking the earliest in time between the backup end LSN and the oldest active transaction’s LSN." Is this incorrect and should it have been "start" instead of end?

  2. Same page, the para above that on the book says that the back-up end LSN is – "LSN of the start of the oldest active transaction" – but should'nt it be end of the backup?

  3. Same page, notice also that the image is incorrect – see Transactions A and B start twice – I think the 2nd one should be transaction end instead of start?

Bit confused.

Best Answer

I would try to answer the question as whole. Yes I believe book has some discrepancy as far as my knowledge goes.

One thing for sure is that the backup explanation pic mentioned in the book has some typos. It forgot to use word commit and instead continued with word starts. Like Xact A Starts is a transaction that started much before backup and somewhere at last it committed and still the pic shows Xact S starts it should be Xact A commits similarly for transaction B. This is what you mentioned in point 3

Backup Operation

Now let me give you very clean definition of what is MIN LSN here. The MIN LSN would be MINIMUM of (Oldest active transaction which is when transaction C starts , LSN when backup started and checkpoint was fired, in this case red line showing checkpoint). This is always how minimum LSN in case of backup would be calculated. Please note Transaction A started before backup but committed before backup ends so it would not be oldest active transaction and also as per backup operation the pages were read so this would not be the oldest active transaction.

Now let us go to definitions

Backup Start LSN: This is basically "Checkpoint LSN"( As mentioned by Paul Randal in Understanding SQL Server Backups) this is recorded when backup starts and checkpoint is fired.

Backup End LSN: This is basically reading the transaction log again and recording 2 things

  1. The LSN of oldest active transaction.

  2. The LSN when backup operation started, For this I am not sure whether this happens or not IMHO this should be same as Checkpoint LSN or Backup start LSN but may be SQL Server is doing this 2nd time to confirm. I believe it does only what I have mentioned in point 1.

Minimum LSN: The would be minimum of ( LSN of oldest active transaction, Checkpoint/backup start LSN ). The definition in the BOOK is little bit confusing.