UPDATE (April 14 2016):
From SQL 2016 – It Just Runs Faster: Indirect Checkpoint Default on the official team Web Log for Microsoft Customer Service and Support (CSS) SQL Support:
Indirect checkpoint is the recommended configuration, especially on systems with large memory footprints and default for databases created in SQL Server 2016.
Does indirect checkpoint take into consideration the number of dirty buffers, instead of the number of transactions that automatic checkpoint usually uses?
Once target_recovery_time is set, SQL Server internally calculates the target dirty buffer threshold. As transactions are logged in the transaction log, a Dirty Page List keeps track of the LSNs and the dirty buffers modified by each transaction. So for indirect checkpoints, the dirty buffers from each transaction are tracked along with their LSNs.
The Recovery Writer (a new background process in SQL Server 2012) periodically polls the Dirty Page List, and if it finds more dirty pages in the buffer pool than the target dirty buffer threshold, it flushes the dirty buffers and moves the minLSN forward.
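The polling logic above can be sketched in Python. To be clear, the names (`DirtyPageList`, `recovery_writer_poll`) and the data structures are illustrative assumptions for this answer, not actual SQL Server internals:

```python
# Illustrative sketch of a recovery-writer style poller (assumed names,
# not real SQL Server internals).

class DirtyPageList:
    def __init__(self):
        self.pages = []  # list of (lsn, page_id), appended in LSN order

    def add(self, lsn, page_id):
        self.pages.append((lsn, page_id))

def recovery_writer_poll(dirty_list, target_dirty_threshold, flush_page):
    """If the dirty-page count exceeds the target, flush the oldest dirty
    pages and advance minLSN (the oldest LSN still protecting a dirty page)."""
    excess = len(dirty_list.pages) - target_dirty_threshold
    if excess <= 0:
        return None  # under threshold: nothing to do this poll
    for lsn, page_id in dirty_list.pages[:excess]:
        flush_page(page_id)          # write the dirty buffer to disk
    dirty_list.pages = dirty_list.pages[excess:]
    # the new minLSN is the oldest LSN still tracked (None if list is empty)
    return dirty_list.pages[0][0] if dirty_list.pages else None
```

The key point the sketch shows: the trigger is the *count of dirty pages* against a threshold derived from target_recovery_time, not a count of transactions.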
Is the indirect checkpoint setting more suitable for a Data Warehouse or an OLTP type workload?
From BOL, an online transactional workload on a database configured for indirect checkpoints can experience performance degradation, because the background writer used by indirect checkpoint sometimes increases the total write load for a server instance.
What scenarios you consider for SQL database in question before you start leveraging indirect checkpoint?
This depends on many factors, as each environment is different in terms of hardware, memory, transaction volume, etc. I would recommend you TEST, and double TEST, this particular feature: it can be helpful to you, or it can have a detrimental effect on your workload and possibly jeopardise your databases as well. By default, target recovery time is set to ZERO (indirect checkpoint disabled), so change it only if you have tested it thoroughly and its behaviour is acceptable to you.
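For reference, the setting is changed per database. The database name below is a placeholder; a value of 0 reverts to automatic checkpoints:

```sql
-- Enable indirect checkpoint with a 60-second recovery target
-- (database name is a placeholder)
ALTER DATABASE [YourDatabase] SET TARGET_RECOVERY_TIME = 60 SECONDS;

-- Verify the current setting
SELECT name, target_recovery_time_in_seconds
FROM sys.databases
WHERE name = N'YourDatabase';
```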
Advantages:
It may improve database recovery time
It may reduce checkpoint I/O spikes, as it continuously writes pages to disk in the background
Disadvantages:
In an OLTP workload it can increase the overall writes on the server by continuously writing pages to disk in the background, which may reduce performance.
Ref:
Setting indirect checkpoints by database in SQL Server 2012
How do checkpoints work and what gets logged ?
For the specific part of the question you asked Sean:
'thus, when the page is "read" by another transaction, the updated
values are there.': How does SQL Server know that data on the disk is
out of date and therefore should read (upon a select query) from the
memory?
Suppose another request comes in and asks to read the same page. The SQL Server Buffer Manager will check the data cache in the buffer pool to see if it already has the page cached in memory. If the page is already cached, the results are passed back to the Access Methods (the Access Methods contain the code which handles I/O requests for rows, indexes, pages, allocations and row versions) to continue with the requested operation. If the page isn’t already in cache, the Buffer Manager will get the page from the database on disk, put it in the data cache, and pass the results to the Access Methods.
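That read path is essentially a cache-aside lookup, which can be sketched as follows (the class and function names are assumptions for illustration, not SQL Server code):

```python
# Sketch of the Buffer Manager read path described above:
# check the data cache first, fall back to a physical read on a miss.

class BufferManager:
    def __init__(self, read_from_disk):
        self.cache = {}                     # page_id -> page contents
        self.read_from_disk = read_from_disk

    def get_page(self, page_id):
        """Return the page, loading it into cache on a miss."""
        if page_id in self.cache:
            return self.cache[page_id]      # hit: serve from memory
        page = self.read_from_disk(page_id) # miss: physical read
        self.cache[page_id] = page          # keep it for future requests
        return page
```

This is also the answer to the "how does it know disk is out of date" question: it does not need to know. A dirty page is only ever modified in cache, and the cache is always consulted first, so a hit always returns the newest version; the stale on-disk copy is simply never read while the page is cached.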
Is there an information on the page header that the data in the page
on the disk is stale? Isn't there a performance penalty for this as it
reads from the disk first and goes to memory next.
Yes there is. The header of each page in cache stores details about the last two times it was accessed, and a periodic scan through the cache examines these values. A counter is maintained that is decremented if the page hasn’t been accessed for a while; and when SQL Server needs to free up some cache, the pages with the lowest counter are flushed first. The process of “aging out” pages from cache and maintaining an available amount of free cache pages for subsequent use can be done by any worker thread after scheduling its own I/O or by the lazywriter process.
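The aging policy described above can be sketched like this. The starting counter value and the names are illustrative assumptions; the real algorithm (LRU-K) keeps more bookkeeping:

```python
# Sketch of the "aging out" policy: each cached page carries a counter
# that is refreshed on access and decays during periodic scans; pages
# with the lowest counters are evicted first when cache must be freed.

class CachedPage:
    def __init__(self, page_id):
        self.page_id = page_id
        self.counter = 4          # assumed starting value, refreshed on access

def touch(page):
    page.counter = 4              # page was accessed: reset its counter

def aging_scan(pages):
    """Periodic scan through the cache: decay every page's counter."""
    for p in pages:
        p.counter = max(0, p.counter - 1)

def evict(pages, n):
    """Free n pages, lowest counter (least recently useful) first."""
    victims = sorted(pages, key=lambda p: p.counter)[:n]
    for v in victims:
        pages.remove(v)
    return [v.page_id for v in victims]
```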
Having said that I would like you to read How It Works: Bob Dorr's SQL Server I/O Presentation
Take the following as an example: update tblxxx set col1 = GetDate(), where the table has 1 billion rows.
The process, simplified for discussion:
Begin transaction--Transaction starts here
Fetch Page--Page is fetched from Disk to memory
Acquire Lock
Acquire Latch
Generate log record--Change to the page generates Log records
Update page LSN to match that of the log record--When a change to a page generates log records, they have LSN numbers identifying them, and the same LSN number is stamped on the header of the page whose change generated them.
Change Data (dirty the page)--Since the page now differs from its image on disk, it is dirty
Release Latch
Commit Transaction Starting--Commit issued for transaction
FlushToLSN of Commit--LSN corresponding to commit is flushed to log disk and recorded. This is done before dirty data page is flushed to disk
Release locks
Commit Transaction Complete
The entire table won’t fit into memory, so the lazy writer will write dirty pages to disk. In doing so, the lazy writer will first call the routine FlushToLSN to flush all log records up to and including the LSN stored in the page header of the dirty page. Then, and only then, will the lazy writer issue the write of the data page. If a power outage occurs at this point the log records can be used to reconstruct the page (roll back the changes).
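The write-ahead rule in that paragraph (harden the log up to the page's LSN before writing the page) can be sketched as follows; the names `flushed_lsn`, `flush_to_lsn` and `lazy_write_page` are illustrative, not SQL Server routines:

```python
# Sketch of the WAL rule: before a dirty page is written, every log
# record up to and including the LSN stamped on the page must be on
# stable storage, so the change can be reconstructed after a crash.

class Log:
    def __init__(self):
        self.records = []       # (lsn, payload); here the LSN is the index
        self.flushed_lsn = -1   # highest LSN durably on the log disk

    def append(self, payload):
        lsn = len(self.records)
        self.records.append((lsn, payload))
        return lsn              # the new record's LSN

    def flush_to_lsn(self, lsn):
        if lsn > self.flushed_lsn:
            # (a real engine writes records (flushed_lsn, lsn] to disk here)
            self.flushed_lsn = lsn

def lazy_write_page(log, page_lsn, write_page_to_disk):
    log.flush_to_lsn(page_lsn)  # WAL: log records first ...
    write_page_to_disk()        # ... then, and only then, the data page
```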
Notice that the latch protects the physical access to the in-memory data page only for the small amount of time the physical change is being made. The lock protects the associated data until the commit or rollback takes place.
This allows lazy writer to remove a dirty page while the locking structures continue to maintain the transactional integrity.
The question which can also arise is if rollback is issued instead of a commit would SQL Server fetch back in pages to undo the update? The answer is yes. The page could have been removed from buffer pool/page cache by lazy writer after the FlushToLSN took place. When the rollback occurs the pages changed by the transaction may need to be fetched into buffer pool to undo the change.
PS: Feel free to ask if any further clarification is required.
Best Answer
We could implement a DBMS such that every thread that needs to write a log record writes it straight to disk immediately and waits for acknowledgement. This would be inefficient since there would be many small IOs and many threads waiting. It is more efficient for each thread to write to an in-memory buffer and have that buffer written to disk as a single large IO. All threads that write log records add to this one buffer and when it is written to disk (i.e. "flushed" or "hardened") all log records from all threads are written.
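The shared in-memory log buffer described above can be sketched as follows. This is a deliberate simplification: a real engine adds latching, multiple buffers, and threads blocking on the flush:

```python
# Sketch of group commit: threads append log records to one in-memory
# buffer; a single flush hardens every pending record in one large IO.

class LogBuffer:
    def __init__(self, write_io):
        self.pending = []        # records not yet on disk
        self.write_io = write_io # performs one physical log write
        self.flushes = 0

    def append(self, record):
        """Called by any thread that generates a log record."""
        self.pending.append(record)

    def flush(self):
        """One IO hardens all pending records from all threads."""
        if self.pending:
            self.write_io(list(self.pending))  # single large write
            self.pending.clear()
            self.flushes += 1
```

Three appends, one IO: that is the efficiency win over each thread writing (and waiting on) its own small log write.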
So when must that buffer be flushed? There are three mandatory circumstances:

Log records must be flushed before a commit acknowledgement is sent to the client, to satisfy the durability guarantee of ACID.

Checkpoints flush dirty pages from the buffer pool to disk, whether the transaction which dirtied those pages has committed or not. On restart after a failure the DBMS needs a way to figure out which changes to those dirty pages should be retained and which removed. SQL Server follows the ARIES protocol for recovery, and ARIES requires the write-ahead log to be flushed before the dirty pages are flushed.

The buffer is of finite size, so the system must be able to empty it at any point, including whenever it fills.

The system can additionally write the buffer any time it chooses. The log contains many types of record other than user data changes. The ARIES protocol takes care of the various combinations of flushed log, flushed or unflushed buffer pool pages, and committed, uncommitted and rolled-back transactions.
The authors you quote are saying the same thing: that log records must be flushed before the corresponding dirty pages can be flushed. This is an echo of the documentation linked previously.