Sql-server – In SQL Server, what’s the impact of a checkpoint to log records in log buffer

buffer-poolcheckpointlogsql servertransaction-log

Does checkpoint operations flush everything in the log buffer to log file? Or just the log records relating to the dirty pages that are about to be flushed? I found some inconsistent description about this by the top names in the SQL Server industry.

From Kalen Delaney's "Microsoft SQL Server 2012 Internals":

Checkpoint operations also write log records from transactions in progress to disk because the cached log records are also considered to be dirty.

"from transactions in progress", yes that's reasonable, since the log records for committed transactions were already written to disk. So it basically means all unflushed log records will be flushed.

From Itzik Ben-Gan's "Understanding log buffer flushes" at https://sqlperformance.com/2018/11/sql-performance/understanding-log-buffer-flushes:

SQL Server needs to harden dirty data pages, e.g., during a checkpoint process, and the log records representing the changes to those pages were not yet hardened (write ahead logging, or WAL in short)

Are the the log records in log buffer that corresponding to the dirty pages all unflushed log records in log buffer? I'm not sure. Is Itzik's description basically means the same thing as Kalen's description?

Best Answer

We could implement a DBMS such that every thread that needs to write a log record writes it straight to disk immediately and waits for acknowledgement. This would be inefficient since there would be many small IOs and many threads waiting. It is more efficient for each thread to write to an in-memory buffer and have that buffer written to disk as a single large IO. All threads that write log records add to this one buffer and when it is written to disk (i.e. "flushed" or "hardened") all log records from all threads are written.

So when must that buffer be flushed? There are three mandatory circumstances

  1. When a transaction commits.
  2. When a checkpoint is taken.
  3. When the buffer is full.

The system can additionally write the buffer any time it chooses. The

Log records must be flushed before a commit acknowledgement is sent to the client to satisfy the durability guarantee of ACID.

Checkpoints flush dirty pages from the bufferpool to disk, whether the transaction which dirtied those pages has committed or not. On restart after failure the DBMS needs a way to figure out which changes to those dirty pages should be retained and which removed. SQL Server follows the ARIES recovery protocol for recovery and ARIES requires the write ahead log to be flushed before the dirty pages are flushed.

Naturally the buffer must be of finite size and so the system must be able to empty it at any point. The log contains many types of record other than user data changes. ARIES protocol takes care of the various combinations of flushed log, flushed or unflushed bufferpool pages, and committed, uncommitted and rolled back transactions.

The authors you quote are saying the same thing - that log records must be flushed before the corresponding dirty pages can be flushed. This is an echo of the documentation linked previously

SQL Server has logic that prevents a dirty page from being flushed before the associated log record is written.