Tracking deadlocks is the easier of the two:
By default, deadlocks are not written to the error log. You can make SQL Server write deadlock information to the error log with trace flags 1204 and 3605.
Write deadlock info to the SQL Server error log:
DBCC TRACEON (1204, 3605, -1)
Turn it off:
DBCC TRACEOFF (1204, 3605, -1)
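To check which trace flags are currently enabled, you can use DBCC TRACESTATUS (the -1 again denotes global scope):

```sql
-- List all globally enabled trace flags
DBCC TRACESTATUS(-1);

-- Check the status of the deadlock-related flags specifically
DBCC TRACESTATUS(1204, 3605);
```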
See "Troubleshooting Deadlocks" (https://msdn.microsoft.com/en-us/library/ms178104.aspx) for a discussion of trace flag 1204 and the output you will get when it is turned on.
Prevention is more difficult; essentially, you have to watch out for the following pattern:
Code Block 1 locks resource A, then resource B, in that order.
Code Block 2 locks resource B, then resource A, in that order.
This is the classic condition under which a deadlock can occur. If the locking of both resources is not atomic, Code Block 1 can lock A and then be preempted, and Code Block 2 locks B before Code Block 1 gets processing time back. Now you have a deadlock.
To prevent this condition, you can do something like the following:
Code Block A (pseudocode)
Lock Shared Resource Z
Lock Resource A
Lock Resource B
Unlock Shared Resource Z
...
Code Block B (pseudo code)
Lock Shared Resource Z
Lock Resource B
Lock Resource A
Unlock Shared Resource Z
...
Do not forget to unlock A and B when you are done with them. This would prevent the deadlocking between Code Block A and Code Block B.
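In SQL Server itself, one way to approximate the "Shared Resource Z" gatekeeper is an application lock via sp_getapplock. This is only a sketch; the lock name 'ResourceZ' and the table names are made up for illustration:

```sql
BEGIN TRANSACTION;

-- Acquire the gatekeeper lock before touching A or B.
-- sp_getapplock returns >= 0 on success and < 0 on timeout or error.
DECLARE @result int;
EXEC @result = sp_getapplock
    @Resource    = 'ResourceZ',
    @LockMode    = 'Exclusive',
    @LockOwner   = 'Transaction',
    @LockTimeout = 10000;   -- milliseconds

IF @result >= 0
BEGIN
    -- Every code path funnels through 'ResourceZ' first, so the
    -- order in which A and B are locked below no longer matters.
    UPDATE dbo.TableA SET SomeCol = SomeCol WHERE Id = 1;  -- hypothetical
    UPDATE dbo.TableB SET SomeCol = SomeCol WHERE Id = 1;  -- hypothetical
END

COMMIT;  -- a transaction-owned applock is released automatically here
```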
From a database perspective, I'm not sure how to go about preventing this situation, as locks are handled by the database engine itself, i.e. row/table locks taken when updating data. Where I've seen the most issues occur is where you saw yours: inside a cursor. Cursors are notoriously inefficient, so avoid them if at all possible.
The IDENTITY generator is not well documented. However, there are some observable behaviors that seem relevant:
Identity generation is not affected by transactions. That means once a value has been used, it will not be reused, even if the transaction that consumed it is rolled back.
Not every use causes the current position of the sequence to be written back to the database. You can see this, for example, after a crash: the next value used after a crash is often several numbers higher than the previous one.
While there is no proof (meaning documentation), it can be assumed that for performance reasons a multi-row insert grabs a block of identity values and uses them until it runs out. Another concurrent thread will get the next block of numbers. At this point the identity value does not actually reflect the order of inserts anymore.
The rowversion data type on the other hand is an ever increasing number that would reflect insert order. (timestamp is a deprecated synonym for rowversion.)
So in your case you can assume that the rows were inserted in the order of the rowversion column, and that the out-of-order identity values are caused by in-memory performance optimizations.
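So, to see the rows in actual insert order, sort by the rowversion column rather than the identity column (table and column names below are assumptions):

```sql
-- Hypothetical table with both an IDENTITY column and a rowversion column
SELECT Id, Payload, RowVer
FROM dbo.YourTable
ORDER BY RowVer;   -- reflects insert order; ORDER BY Id might not
```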
By the way, while the IDENTITY generator is not very well documented, the new SQL Server 2012 SEQUENCE functionality is. There you can read all about the behaviors described above, as they apply to sequences.
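For example, with SEQUENCE the block-caching behavior is explicit and configurable, which matches the crash-gap behavior described above. A sketch:

```sql
-- CACHE 50 pre-allocates blocks of 50 values for performance.
-- Values remaining in an unflushed cache are lost after a crash,
-- which produces the gaps described above.
CREATE SEQUENCE dbo.OrderNumbers
    AS bigint
    START WITH 1
    INCREMENT BY 1
    CACHE 50;    -- or NO CACHE to trade performance for fewer gaps

SELECT NEXT VALUE FOR dbo.OrderNumbers;
```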
As for your concern with replication:
Transactional replication uses the database log and does not rely on specific column values.
Merge replication uses a rowguid column to identify a row. This column is assigned a value once and does not change throughout the life of the row. Merge replication does not use a rowversion column. Transactional consistency is enforced by the fact that normal locking is used at the time of a synchronization, so a transaction is either completely visible to the merge agent or completely invisible.
Snapshot replication does not look for changes at all. It just takes the data committed at the time of the synchronization and copies it over.
Best Answer
I do not think there is a clean solution for adding the rowversion value in the same insert statement. Here are some reasons why I think that:
You could create a binary(8) column and store the value of MIN_ACTIVE_ROWVERSION() in there, but among other issues, inserting multiple records is going to fail, because MIN_ACTIVE_ROWVERSION() is calculated once per set, not per row.
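To illustrate the multi-row failure (the table is hypothetical): since MIN_ACTIVE_ROWVERSION() is evaluated once for the whole statement, every row of a set-based insert would receive the same value, so any uniqueness constraint on the column breaks:

```sql
CREATE TABLE dbo.Demo
(
    Id   int IDENTITY PRIMARY KEY,
    Ver  binary(8) NOT NULL UNIQUE,  -- meant to mirror a rowversion value
    Name varchar(50) NOT NULL
);

-- Fails: MIN_ACTIVE_ROWVERSION() is computed once per statement, so both
-- rows get an identical Ver value and violate the UNIQUE constraint.
INSERT INTO dbo.Demo (Ver, Name)
VALUES (MIN_ACTIVE_ROWVERSION(), 'first'),
       (MIN_ACTIVE_ROWVERSION(), 'second');
```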
You could use an AFTER INSERT trigger to update the column, but you would not be able to do that with a primary key constraint on it unless you insert a unique placeholder value first. And, as a side effect of this update, the rowversion column of the row is updated again.
Changing the query to do an insert + update would get the same end result as the trigger.
Another option would be row-by-row inserts with an INSTEAD OF trigger.
Based on Listing 9 from this source
Note that I had to use @@DBTS + 1 here, because MIN_ACTIVE_ROWVERSION() was not deemed active inside the loop. Among other considerations, the performance of this will be very slow, depending on the number of inserted records.
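Since that listing is not reproduced here, the following is only a rough sketch of the row-by-row INSTEAD OF trigger approach, with all object names invented and @@DBTS cast to bigint so the + 1 arithmetic is explicit:

```sql
CREATE TABLE dbo.DemoRbr
(
    Id   int IDENTITY PRIMARY KEY,
    RV   rowversion,                 -- advances @@DBTS on every insert
    Ver  binary(8) NOT NULL UNIQUE,  -- predicted copy of the rowversion
    Name varchar(50) NOT NULL
);
GO

CREATE TRIGGER dbo.trDemoRbrInsert
ON dbo.DemoRbr
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @name varchar(50);

    -- Row by row: each single-row insert advances @@DBTS, so
    -- @@DBTS + 1 predicts the next rowversion value per iteration.
    DECLARE c CURSOR LOCAL FAST_FORWARD FOR
        SELECT Name FROM inserted;
    OPEN c;
    FETCH NEXT FROM c INTO @name;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- INSTEAD OF triggers do not fire recursively for the same
        -- action on the same table, so this insert goes straight in.
        INSERT INTO dbo.DemoRbr (Ver, Name)
        VALUES (CAST(CAST(@@DBTS AS bigint) + 1 AS binary(8)), @name);
        FETCH NEXT FROM c INTO @name;
    END
    CLOSE c;
    DEALLOCATE c;
END;
```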
When considering these examples, you would have to choose between failing multi-row inserts, a trigger that updates every row a second time, or slow row-by-row inserts. Those choices are really not choices I would like to make. That is not to say that somebody more knowledgeable than me could not give you a better solution.