Sql-server – Deadlock when update different rows with non-clustered index

indexlockingsql server

I'm solving a deadlocking issue while I noticed the lock behavior is different when I use clustered and non-clustered index on the id field. The deadlock issue seems to be solved if clusted index or primary key is applied to the id field.

I have different transactions doing one or more updates to different rows, e.g. transaction A will only update row with ID=a, tx B will only touch row with ID=b etc.

And I have understand that without index, the update will acquire update lock for all the rows and covert to exclusive lock when necessary, which will eventually leads to deadlock. But I fail to find out why with non-clustered index, the deadlock is still there (though hit rate seems to be dropped)

Data table:

CREATE TABLE [dbo].[user](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [userName] [nvarchar](255) NULL,
    [name] [nvarchar](255) NULL,
    [phone] [nvarchar](255) NULL,
    [password] [nvarchar](255) NULL,
    [ip] [nvarchar](30) NULL,
    [email] [nvarchar](255) NULL,
    [pubDate] [datetime] NULL,
    [todoOrder] [text] NULL
)

Deadlock trace

deadlock-list
deadlock victim=process4152ca8
process-list
process id=process4152ca8 taskpriority=0 logused=0 waitresource=RID: 5:1:388:29 waittime=3308 ownerId=252354 transactionname=user_transaction lasttranstarted=2014-04-11T00:15:30.947 XDES=0xb0bf180 lockMode=U schedulerid=3 kpid=11392 status=suspended spid=57 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2014-04-11T00:15:30.953 lastbatchcompleted=2014-04-11T00:15:30.950 lastattention=1900-01-01T00:00:00.950 clientapp=.Net SqlClient Data Provider hostname=BOOD-PC hostpid=9272 loginname=getodo_sql isolationlevel=read committed (2) xactid=252354 currentdb=5 lockTimeout=4294967295 clientoption1=671088672 clientoption2=128056
executionStack
frame procname=adhoc line=1 stmtstart=62 sqlhandle=0x0200000062f45209ccf17a0e76c2389eb409d7d970b0f89e00000000000000000000000000000000
update [user] WITH (ROWLOCK) set [todoOrder]=@para0 where id=@owner
frame procname=unknown line=1 sqlhandle=0x00000000000000000000000000000000000000000000000000000000000000000000000000000000
unknown
inputbuf
(@para0 nvarchar(2)<c/>@owner int)update [user] WITH (ROWLOCK) set [todoOrder]=@para0 where id=@owner
process id=process4153468 taskpriority=0 logused=4652 waitresource=KEY: 5:72057594042187776 (3fc56173665b) waittime=3303 ownerId=252344 transactionname=user_transaction lasttranstarted=2014-04-11T00:15:30.920 XDES=0x4184b78 lockMode=U schedulerid=3 kpid=7272 status=suspended spid=58 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2014-04-11T00:15:30.960 lastbatchcompleted=2014-04-11T00:15:30.960 lastattention=1900-01-01T00:00:00.960 clientapp=.Net SqlClient Data Provider hostname=BOOD-PC hostpid=9272 loginname=getodo_sql isolationlevel=read committed (2) xactid=252344 currentdb=5 lockTimeout=4294967295 clientoption1=671088672 clientoption2=128056
executionStack
frame procname=adhoc line=1 stmtstart=60 sqlhandle=0x02000000d4616f250747930a4cd34716b610a8113cb92fbc00000000000000000000000000000000
update [user] WITH (ROWLOCK) set [todoOrder]=@para0 where id=@uid
frame procname=unknown line=1 sqlhandle=0x00000000000000000000000000000000000000000000000000000000000000000000000000000000
unknown
inputbuf
(@para0 nvarchar(61)<c/>@uid int)update [user] WITH (ROWLOCK) set [todoOrder]=@para0 where id=@uid
resource-list
ridlock fileid=1 pageid=388 dbid=5 objectname=SQL2012_707688_webows.dbo.user id=lock3f7af780 mode=X associatedObjectId=72057594042122240
owner-list
owner id=process4153468 mode=X
waiter-list
waiter id=process4152ca8 mode=U requestType=wait
keylock hobtid=72057594042187776 dbid=5 objectname=SQL2012_707688_webows.dbo.user indexname=10 id=lock3f7ad700 mode=U associatedObjectId=72057594042187776
owner-list
owner id=process4152ca8 mode=U
waiter-list
waiter id=process4153468 mode=U requestType=wait

Also an interesting and possible related finding is that clustered and non-clustered index seems to have different lock behaviors

When use the clustered index, there is an exclusive lock on the key as well as an exclusive lock on RID when do update, which is expected; while there are two exclusive lock on two different RID if non-clustered index is used, which confuses me.

Would be helpful if anyone can explain why on this too.

Test SQL:

use SQL2012_707688_webows;
begin transaction;
update [user] with (rowlock) set todoOrder='{1}' where id = 63501
exec sp_lock;
commit;

With id as Clustered Index:

spid    dbid    ObjId   IndId   Type    Resource    Mode    Status
53  5   917578307   1   KEY (b1a92fe5eed4)                      X   GRANT
53  5   917578307   1   PAG 1:879                               IX  GRANT
53  5   917578307   1   PAG 1:1928                              IX  GRANT
53  5   917578307   1   RID 1:879:7                             X   GRANT

With id as Non-Clustered Index

spid    dbid    ObjId   IndId   Type    Resource    Mode    Status
53  5   917578307   0   PAG 1:879                               IX  GRANT
53  5   917578307   0   PAG 1:1928                              IX  GRANT
53  5   917578307   0   RID 1:879:7                             X   GRANT
53  5   917578307   0   RID 1:1928:18                           X   GRANT

EDIT1 : Details of deadlock without any index
Say I have two tx A and B, each with two update statements, different row of course
tx A

update [user] with (rowlock) set todoOrder='{1}' where id = 63501
update [user] with (rowlock) set todoOrder='{2}' where id = 63501

tx B

update [user] with (rowlock) set todoOrder='{3}' where id = 63502
update [user] with (rowlock) set todoOrder='{4}' where id = 63502

{1} and {4} would have a chance of deadlock, since

at {1}, U lock is requested for row 63502 since it needs to do a table scan,
and X lock could have been hold on row 63501 since it matches the condition

at {4}, U lock is requested for row 63501, and X lock already hold for 63502

so we have txA holds 63501 and waits 63502 while txB holds 63502 waiting for 63501, which is a deadlock

EDIT2 : Turns out a bug of my test case makes a difference situation here
Sorry for confusion but the bug makes a difference situation, and seems to cause the deadlock eventually.

Since Paul's analysis really helped me out in this case so I'll accept that as an answer.

Due to the bug of my test case, two transaction txA and txB can update the same row, as below:

tx A

update [user] with (rowlock) set todoOrder='{1}' where id = 63501
update [user] with (rowlock) set todoOrder='{2}' where id = 63501

tx B

update [user] with (rowlock) set todoOrder='{3}' where id = 63501

{2} and {3} would have a chance of deadlock when:

txA requests U lock on the key while holds X lock on RID (due to update of {1})
txB requests U lock on RID while holds U lock on the key

Best Answer

...why with clustered index, the deadlock is still there (though hit rate seems to be dropped)

The question isn't precisely clear (e.g. how many updates and to which id values are in each transaction) but one obvious deadlock scenario arises with multiple single-row updates within a single transaction, where there is an overlap of [id] values, and the ids are updated in a different [id] order:

[T1]: Update id 2; Update id 1;
[T2]: Update id 1; Update id 2;

Deadlock sequence: T1 (u2), T2 (u1), T1 (u1) wait, T2 (u2) wait.

This deadlock sequence might be avoided by updating strictly in id order within each transaction (acquiring locks in the same order on the same path).

When use the clustered index, there is an exclusive lock on the key as well as an exclusive lock on RID when do update, which is expected; while there are two exclusive lock on two different RID if non-clustered index is used, which confuses me.

With a unique clustered index on id, an exclusive lock is taken on the clustering key to protect writes to the in-row data. A separate RID exclusive lock is required to protect the write to the LOB text column, which is stored on a separate data page by default.

When the table is a heap with only a nonclustered index on id, two things happen. First, one RID exclusive lock relates to the heap in-row data, and the other is the lock on the LOB data as before. The second effect is that a more complex execution plan is required.

With a clustered index and a simple single-value equality predicate update, the query processor can apply an optimization that performs the update (read and write) in a single operator, using a single path:

Single-operator update

The row is located and updated in a single seek operation, requiring only exclusive locks (no update locks are needed). An example locking sequence using your sample table:

acquiring IX lock on OBJECT: 6:992930809:0 -- TABLE
acquiring IX lock on PAGE: 6:1:59104 -- INROW
acquiring X lock on KEY: 6:72057594233618432 (61a06abd401c) -- INROW
acquiring IX lock on PAGE: 6:1:59091 -- LOB
acquiring X lock on RID: 6:1:59091:1 -- LOB

releasing lock reference on PAGE: 6:1:59091 -- LOB
releasing lock reference on RID: 6:1:59091:1 -- LOB
releasing lock reference on KEY: 6:72057594233618432 (61a06abd401c) -- INROW
releasing lock reference on PAGE: 6:1:59104 -- INROW

With only a nonclustered index, the same optimization cannot be applied because we need to read from one b-tree structure and write another. The multi-path plan has separate read and write phases:

Multi-iterator update

This acquires update locks when reading, converting to exclusive locks if the row qualifies. Example lock sequence with the schema given:

acquiring IX lock on OBJECT: 6:992930809:0 -- TABLE
acquiring IU lock on PAGE: 6:1:59105 -- NC INDEX
acquiring U lock on KEY: 6:72057594233749504 (61a06abd401c) -- NC INDEX
acquiring IU lock on PAGE: 6:1:59104 -- HEAP
acquiring U lock on RID: 6:1:59104:1 -- HEAP
acquiring IX lock on PAGE: 6:1:59104 -- HEAP convert to X
acquiring X lock on RID: 6:1:59104:1 -- HEAP convert to X
acquiring IU lock on PAGE: 6:1:59091 -- LOB
acquiring U lock on RID: 6:1:59091:1 -- LOB

releasing lock reference on PAGE: 6:1:59091 
releasing lock reference on RID: 6:1:59091:1
releasing lock reference on RID: 6:1:59104:1
releasing lock reference on PAGE: 6:1:59104 
releasing lock on KEY: 6:72057594233749504 (61a06abd401c)
releasing lock on PAGE: 6:1:59105 

Note the LOB data is read and written at the Table Update iterator. The more complex plan and multiple read and write paths increase the chances of a deadlock.

Finally, I can't help but notice the data types used in the table definition. You should not use the deprecated text data type for new work; the alternative, if you really need the ability to store up to 2GB of data in this column, is varchar(max). One important difference between text and varchar(max) is that text data is stored off-row by default, while varchar(max) stores in-row by default.

Use Unicode types only if you need that flexibility (e.g. it is hard to see why an IP address would need Unicode). Also, choose appropriate length limits for your attributes - 255 everywhere seems unlikely to be correct.

Additional reading:
Deadlock and livelock common patterns
Bart Duncan's deadlock troubleshooting series

Tracing locks can be done in a variety of ways. SQL Server Express with Advanced Services (2014 & 2012 SP1 onward only) contains the Profiler tool, which is a supported way to view the details of lock acquisition and release.