Sql-server – Can someone explain why select with nolock will query a potion of updated data

nolockselectsql server

I was reading the answer from here (from stackoverflow, I think should ask in here)

NOLOCK means placing no locks at all.

Your query may returns portions of data as of before UPDATE and
portions as of after UPDATE in a single query.

I get that nolock will not place lock to the table, so other people can query the same time.

From the answer and example it show, it fetch data while the data is being updating.

Why does that happen?

I am assuming for normal select it will try place lock on table, so when update statement is executed, it place a lock on the row or page. Then when I try to run select statement, it cannot put the lock until the update statement lock is released.

But in this case because the select statement doesn't try to put lock on the table, so it can run without waiting for the update statement release the lock?

Best Answer

It is not quite true that NOLOCK means placing no locks at all. Queries under this hint will still take Sch-S locks and (possibly HOBT locks).

Under read committed isolation level SQL Server will (usually) take row level S locks and release them as soon as the data is read. These are incompatible with the X locks held on uncommited updates and thus prevent dirty reads.

In the example in the linked answer the SELECT query is not blocked when it encounters a modified row so reading partial updates is quite likely.

It can also happen at default read committed isolation level too though that a SELECT reads some rows with the "before" value and others with the "after" value. It is just needed to engineer a situation where

  1. Select query reads value of row R1 and releases its S lock
  2. Update query updates R2 and takes an X lock
  3. Select query tries to read R2 and is blocked.
  4. Update query updates R1 and takes an X lock.
  5. Update transaction commits thus releasing its locks and allowing the Select to read R2

This type of situation might arise for example if the SELECT and UPDATE are using different indexes to locate the rows of interest.

Example

CREATE TABLE T
(
X INT IDENTITY PRIMARY KEY,
Y AS -X UNIQUE,
Name varchar(10),
Filler char(4000) DEFAULT 'X'
)


INSERT INTO T (Name)
SELECT TOP 2500 'A'
FROM master..spt_values

Now in one query window run

DECLARE @Sum int

SELECT 'SET @@ROWCOUNT' WHERE 1=0

WHILE (@@ROWCOUNT = 0)
SELECT @Sum = SUM(LEN(Name))
FROM T 
WHERE Y IN (-1, -2500)
HAVING SUM(LEN(Name)) = 3

This will run in an infinite loop. In another run

UPDATE T 
SET Name=CASE WHEN Name = 'A' THEN 'AA' ELSE 'A' END

This will likely stop the loop in the other query (try again if not) meaning that it must have read either A,AA or AA,A