Sql-server – SQL Server 2008 R2 Dirty reads – how non-atomic

sql serversql-server-2008sql-server-2008-r2

I am wondering "how dirty" dirty reads can get under a read-uncommited isolation level. I understand that rows that have been updated but not yet committed are visible, but:

  1. Can a row appear as partially updated – that is, some of the columns are updated and some are not?
  2. Can a single column appear partially updated. For example, if you have a varchar(4000) column that was in the process of being completely updated and assuming it actually contains 4000 chars. Can you read say 2k chars from the previous state and 2k chars from its new state?
    What about varchar(max) with length >8k?

Update: After some debating, the minimal consensus is that if the column size is >8KB, dirty reads, even within the column itself, are possible.

Best Answer

EDITED after reading the MSDN forum link from the comment, very interesting.

Regardless of isolation level, two users cannot update a single page simultaneously, nor can any user read a partially updated page. Just imagine how SQL Server would deal with a page where the header says Col3 starts at byte 17. But it really starts at byte 25, because that part of the row hasn't been updated yet. There's no way a database could handle that.

But for rows larger than 8k, multiple pages are used, and that makes a half-updated column possible. Copied from the MSDN link (in case the link breaks), start this query in one window:

if object_id('TestTable') is not null
    drop table TestTable
create table TestTable (txt nvarchar(max) not null)
go
insert into TestTable select replicate(convert(varchar(max),
    char(65+abs(checksum(newid()))%26)),100000)
go 10
update TestTable set txt=replicate(convert(varchar(max),
    char(65+abs(checksum(newid()))%26)),100000)
go 100000

This creates a table and then updates it with a string of 100.000x the same character. While the first query is running, start this query in another window:

while 1=1 begin
 if exists (select * from TestTable (nolock) where left(Txt,1) <> right(Txt,1))
    break
end

The second query stops when it reads a column that is half updated. That is, when the first character is different from the last. It will finish quickly, proving that it is possible to read half-updated columns. If you remove the nolock hint, the second query will never finish.

Surprising result! A half-updated XML column might break a (nolock) report, because the XML would be malformed.