Why does the SERIALIZABLE transaction isolation level not guarantee point-in-time views of data?

isolation-level, snapshot-isolation, sql-server, t-sql, transaction

I have been told that if your transaction consists of multiple statements, then the SNAPSHOT transaction isolation level is the only way to guarantee consistent point-in-time views of data. Why is the SERIALIZABLE transaction isolation level inadequate for this task? By design, the locks that SERIALIZABLE holds are very tight.

I think the gap in my understanding is that I am unsure when SERIALIZABLE takes its very tight locks. A script like the below will likely be very helpful in finding what I am missing.

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

BEGIN TRANSACTION;

-- First read: any locks are taken here, not at BEGIN TRANSACTION
SELECT TOP (1) * FROM [Hit_Me_First];

-- Window for a second session to change data
WAITFOR DELAY '00:02:00';

-- Second read: what does this see?
SELECT TOP (1) * FROM [Hit_Me_Second];

COMMIT TRANSACTION;

Best Answer

People always seem to want to understand isolation levels in terms of locks.

This is somewhat strange, because isolation levels are generally defined not by locking but by the concurrency phenomena* one might experience at each level.

Some products happen to implement some isolation levels using locking, but this isn't true for all levels or products. For example, SQL Server has both lock-based and versioning-based implementations for the read committed isolation level.

Even serializable isolation is implemented in SQL Server without locking when you use the in-memory OLTP (Hekaton) storage engine. PostgreSQL also implements serializable using a versioning scheme by the way.

I have been told that if your transaction consists of multiple statements, then the SNAPSHOT transaction isolation level is the only way to guarantee consistent point-in-time views of data.

Snapshot isolation does guarantee you see only committed data as it was at the start of the transaction. The start of the transaction is the first time you access versioned data within the transaction, not when you issue BEGIN TRAN.
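This is easy to demonstrate. The sketch below assumes a database with ALLOW_SNAPSHOT_ISOLATION enabled and borrows the hypothetical [Hit_Me_First] table from the question:

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

BEGIN TRANSACTION;  -- the snapshot is NOT taken yet

-- Changes committed by other sessions during this delay...
WAITFOR DELAY '00:00:30';

-- ...ARE visible here: the snapshot begins at this first access of versioned data
SELECT TOP (1) * FROM [Hit_Me_First];

-- Every later read in the transaction repeats that same snapshot
SELECT TOP (1) * FROM [Hit_Me_First];

COMMIT TRANSACTION;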

Serializable also provides a consistent view of the data as it was at a point in time, but that point in time is generally not the start of the transaction. To understand this, consider the SQL Standard definition (emphasis added):

A serializable execution is defined to be an execution of the operations of concurrently executing SQL-transactions that produces the same effect as some serial execution of those same SQL-transactions. A serial execution is one in which each SQL-transaction executes to completion before the next SQL-transaction begins.

If the effects are the same as if all transactions ran serially, in some order, you will clearly see only committed data as it was when your transaction ran in its place in that serial order (i.e. with no concurrent activity). The crucial difference is that you cannot know in advance what that serial order was.

If you really must think about this through the locking implementation, consider that SQL Server takes sufficient locks to ensure all data needed by your transaction does not change after it is read. Further, no new data that would be seen by your transaction if it were re-run can be added until your transaction ends (no 'phantoms').
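The phantom protection can be observed directly with two sessions. This sketch assumes a hypothetical table [Hit_Me_First] with an indexed integer column [id]:

-- Session 1
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;

-- Key-range locks are taken to protect this predicate
SELECT * FROM [Hit_Me_First] WHERE [id] BETWEEN 1 AND 10;

-- ...leave the transaction open...

-- Session 2: this blocks until session 1 ends, because the new row
-- would appear as a phantom if session 1's SELECT were re-run
INSERT [Hit_Me_First] ([id]) VALUES (5);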

Through that lens, the point-in-time view runs from the moment the last lock needed by your transaction was acquired until the end of your transaction. For convenience, you could take that point to be the time the transaction ended.

Again, I emphasise that this is a consequence of an implementation decision. It could change, in principle. I would encourage you to understand concurrency through the logical guarantees provided and concurrency effects avoided.

For more, see my article The Serializable Isolation Level and the rest of the series for other isolation levels, including snapshot isolation.


* Any effect due to concurrency that would not occur if the transaction ran alone