SQL Server – How Internal Restoration Works

backuprestoresql server

Suppose my Database DB1 lost some data, but I have a Full, Differential and then followed by Transaction log backups.

Till full\Differential backup I can understand it is like the snapshot of the database, but after this, how T-log Backup works, if we have point in time recovery.

Means like all the committed transaction from the T-log Backup are taken in a sequence and then after it is performed again on the Database (DB1).

Is this right?

If yes, Is this same is followed during log-shipping, i.e, when logs are shipped from one end to other.

Thanks in Advance.

Best Answer

I think this is the core methodology of database backup and restoration and the whole purpose of taking backup is to be able to restore it to point in time in case of any untoward incident like disk failure, server failure or database crash etc.

Lets understand them one by one.

Full Backup

A full backup, as the name implies, backs up everything. It is the foundation of any kind of backup. This is a complete copy, which stores all the objects of the database: Tables, procedures, functions, views, indexes etc. Having a full backup, you will be able to easily restore a database in exactly the same form as it was at the time of the backup.

A full backup creates a complete backup of the database as well as part of the transaction log, so the database can be recovered. This allows for the simplest form of database restoration, since all of the contents are contained in one single backup.

A full backup must be done at least once before any of the other types of backups can be run—this is the foundation for every other kind of backup.

Differential Backup

A differential database backup is the superset of the last full backup and contains all changes that have been made since the last full backup. So, if there are very few transactions that have happened recently, a differential backup might be small in size, but if you have made a large number of transactions, the differential backup could be very large in size.

As a differential backup doesn’t back up everything, the backup usually runs quicker than a full backup. A differential database backup captures the state of the changed extents at the time that backup was created. If you create a series of differential backups, a frequently-updated database is likely to contain different data in each differential. As the differential backups increase in size, restoring a differential backup can significantly increase the time that is required to restore a database. Therefore, it is recommended to take a new full backup, at set intervals, to establish a new differential base for the data.

Differential backups save storage space and the time it takes for a backup. However, as data changes over time, the differential backup size also increases. The longer the age of a differential backup and larger the size and at some point in time it may reach the size of the full backup. A large differential backup loses the advantages of a faster and smaller backup as it requires the full backup to be restored before restoring the recent differential backup. Typically, we would restore the most recent full backup followed by the most recent differential backup that is based on that full backup.

Transaction Log Backup

The log backup, as its name implies, backs up the transaction logs. This backup type is possible only with full or bulk-logged recovery models. A transaction log file stores a series of the logs that provide the history of every modification of data, in a database. A transaction log backup contains all log records that have not been included in the last transaction log backup.

It allows the database to be recovered to a specific point in time. This means that the transaction log backups are incremental and differential backups are cumulative in nature. If you want to restore the database to a specific point in time, you need restore a full, recent differential, and all the corresponding transaction log records which are necessary to build the database up to that specific point, or to a point very close to the desired point in time, just before the occurrence of the accident that resulted in the data loss. This series of modifications is contained and maintained using LSN (Log Sequence Number) in the log chain. A log backup chain is an unbroken series of logs that contain all the transaction log records necessary to recover a database to a point in time. A log chain always starts with a full database backup and continues until for reason it breaks the chain (for example, changing the recovery model of database to simple, or taking an extra full backup), thus by preventing log backups from being taken on the database until another full (or differential) backup is initiated for that database.

For having more clarity, you can refer below image:

enter image description here

Below image will show how to use SSMS for point in time restoration once you are done restoring Full, Differential and prior log backups:

Point in time restoration Screenshot

Log shipping works on same concept however Full and differential backup restoration is part of initial set up whereas log backup and restoration works on the LSN part. Each log backup at primary servers maintains last LSN, same is compared at Secondary server and applied accordingly.

As far as internals of database restoration is concerned, you may find more details of it at these links:

Basically they go through three phases in case of any failure or non-clean shutdown:

  1. Analysis Phase analyzes the transaction log to determine what is the last checkpoint, and creates the Dirty Page Table (DPT) and the Active Transaction Table (ATT). The DPT contains records of pages that were dirty at the time the database was shut down. The ATT contains records of transactions that were active at the time the database was not cleanly shut down.

  2. Redo Phase rolls forwards every modification recorded in the log that may not have been written to the data files at the time the database was shut down. The minimum log sequence number (minLSN) required for a successful database-wide recovery is found in the DPT, and marks the start of the redo operations needed on all dirty pages. At this phase, the SQL Server Database Engine writes to disk all dirty pages belonging to committed transactions.

  3. Undo Phase rolls back incomplete transactions found in the ATT to make sure the integrity of the database is preserved. After rollback, the database goes online, and no more transaction log backups can be applied to the database.

When we are restoring database from Full and Differential backup and Log backup, without using recovery command i.e. with NORECOVERY 2nd phase will be performed which is redo upto the last committed transactions. When you issue command for recovery i.e. in the last log backup restoration or point in time restoration, undo will be performed and no more log backups can be applied afterwards.

First phase i.e. analysis phase is when database comes up from the failure or unclean shutdown and won't be applicable in case of backup restoration.

Hope above helps.