Sql-server – Clarifying delayed durability

sql server

I'm considering applying delayed durability in a system I work on and want to make sure I have the correct mental model in order to avoid data loss. My understanding is it basically means COMMIT turns into a sort-of "Schrodinger's cat" between COMMIT and ROLLBACK, right? In other words, some transactions that are committed might actually end up rolled back ("the commit is a lie").

Hence, we should not actually expect to see any data loss in scenarios like the following:

  1. A service broker activation procedure that RECEIVEs messages, processes them inside a transaction, and COMMITs. If "the commit is a lie" and the transaction doesn't actually commit, it will be a rollback instead and service broker will automatically rerun it.
  2. A periodic ETL-type job which bulk imports files into a staging table (this part would be fully durable). Afterwards, inside a transaction, it processes the staged data and marks the staged data as processed. If "the commit is a lie" and the transaction doesn't actually commit, it will be a rollback instead and the staged data will not be marked as processed, so we'll do it again next time.

In these scenarios we would not expect actual data loss because the transactions get rerun automatically. In this case it's no more risky than anything else that could interrupt a transaction (e.g. a deadlock or an unplanned shutdown that interrupts the process).

In contrast, a scenario like the following would not be a good candidate for delayed durability and would risk data loss:

  1. An external process (e.g. SSIS), for each file in a directory, begins a transaction, bulk imports the data from the file, and commits the transaction. Once the transaction is committed, it deletes the file. Now, if "the commit is a lie" and the transaction doesn't actually commit, we will have lost the file data for good because we deleted the file but the import was rolled back.

So in sum, delayed durability can lose data only if some external party is relying on the COMMIT. For example, an online checkout enters a sale and reasonably assumes that when the data has been committed, the sale is finished, etc., and informs the user as such. But if it's some processing that can already automatically tolerate and resume from interruptions/rollbacks, there should be no increased risk.

Does that sound right?

Best Answer

That is basically correct.

If a transaction is reading from the database and writing to the database, and you have some mechanism to retry the transaction if it is rolled back, you will have no data loss with delayed durability.

But in both the scenarios you describe, I doubt delayed durability would be useful. Delayed durability can improve throughput where a session is committing a large number of small transactions by eliminating the log flush on commit. In the Service Broker sample, the transaction should be scoped around a batch of messages to reduce the commit frequency. And in the ETL example you have a large transaction, so eliminating the single log flush on commit at the end is unlikely to be a material improvement.

If a client is writing new data to the database with delayed durability, there may be data loss.