First, though slightly unrelated to the main question, your MERGE
statement is potentially at risk of errors due to a race condition. The problem, in a nutshell, is that it is possible for multiple concurrent threads to conclude that the target row does not exist, resulting in colliding attempts to insert. The root cause is that it is not possible to take a shared or update lock on a row that does not exist. The solution is to add a hint:
MERGE [dbo].[InverterData] WITH (SERIALIZABLE) AS [TARGET]
The serializable isolation level hint ensures the key range where the row would go is locked. You have a unique index to support range locking, so this hint will not have an adverse effect on locking; you will simply gain protection against this potential race condition.
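Applied to a hypothetical version of your procedure, the hinted statement might look like the sketch below. The column names (`[Serial]`, `[Value]`) and parameters are invented for illustration; only the table name and the hint come from your question.

```sql
-- Sketch only: column names and parameters are hypothetical.
MERGE [dbo].[InverterData] WITH (SERIALIZABLE) AS [TARGET]
USING (VALUES (@Serial, @Value)) AS [SOURCE] ([Serial], [Value])
    ON [TARGET].[Serial] = [SOURCE].[Serial]
WHEN MATCHED THEN
    UPDATE SET [TARGET].[Value] = [SOURCE].[Value]
WHEN NOT MATCHED THEN
    INSERT ([Serial], [Value])
    VALUES ([SOURCE].[Serial], [SOURCE].[Value]);
```

With the hint in place, when the target row does not exist, a key-range lock is taken on the range where it would be inserted, so a second concurrent execution waits instead of making a colliding insert attempt.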
Main Question
Why are the SELECTs blocked by the [InsertOrUpdateInverterData] procedure that is only using MERGE commands?
Under the default (locking) read committed isolation level, shared (S) locks are taken when reading data, and typically (though not always) released soon after the read completes. Some shared locks are held to the end of the statement.
A MERGE statement modifies data, so it will acquire S or update (U) locks when locating the data to change, which are converted to exclusive (X) locks just before performing the actual modification. Both U and X locks must be held to the end of the transaction.
This is true under all isolation levels except the 'optimistic' snapshot isolation (SI), not to be confused with versioning read committed, also known as read committed snapshot isolation (RCSI).
Nothing in your question shows a session waiting for an S lock being blocked by a session holding a U lock; those locks are compatible. Any blocking is almost certainly caused by a held X lock. This can be tricky to capture when a large number of short-term locks are being taken, converted, and released in a short time interval.
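One way to catch the blocking in the act is to poll the waiting-tasks DMV from a separate session while the workload runs. This is a sketch; you may want to add joins to `sys.dm_exec_requests` for statement text:

```sql
-- Shows who is waiting, on what lock resource, and which session blocks them.
SELECT
    wt.session_id,
    wt.blocking_session_id,
    wt.wait_type,              -- e.g. LCK_M_S = waiting to acquire a shared lock
    wt.wait_duration_ms,
    wt.resource_description
FROM sys.dm_os_waiting_tasks AS wt
WHERE wt.wait_type LIKE N'LCK%';
```

If the SELECTs show up here with `LCK_M_S` waits blocked by the session running the procedure, a held X lock is the likely culprit.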
The open_tran_count: 1
on the InsertOrUpdateInverterData command is worth investigating. Although the command hadn't been running very long, you should check that you don't have a containing transaction (in the application or higher-level stored procedure) that is unnecessarily long. Best practice is to keep transactions as short as possible. This may be nothing, but you should definitely check.
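A quick check for a lingering containing transaction is the oldest-open-transaction report, run in the context of the database in question:

```sql
-- Reports the oldest active transaction in the current database, if any,
-- including its start time and the session (SPID) that owns it.
DBCC OPENTRAN;
```

If this shows a transaction that has been open much longer than the current command, look for a transaction started in the application or a higher-level procedure.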
Potential solution
As Kin suggested in a comment, you could look to enable a row-versioning isolation level (RCSI or SI) on this database. RCSI is the most often used, since it typically does not require as many application changes. Once enabled, the default read committed isolation level uses row versions instead of taking S locks for reads, so S-X blocking is reduced or eliminated. Some operations (e.g. foreign key checks) still acquire S locks under RCSI.
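Enabling RCSI is a database-level setting. The database name below is a placeholder; also note the command needs exclusive access to the database, so run it in a maintenance window or force it as shown:

```sql
-- [YourDatabase] is a placeholder for the actual database name.
-- WITH ROLLBACK IMMEDIATE kicks out other connections; use with care.
ALTER DATABASE [YourDatabase]
    SET READ_COMMITTED_SNAPSHOT ON
    WITH ROLLBACK IMMEDIATE;
```

Once set, existing queries running at read committed automatically read row versions, with no code changes required.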
Be aware though that row versions consume tempdb space, broadly speaking proportional to the rate of change activity and the length of transactions. You will need to test your implementation thoroughly under load to understand and plan for the impact of RCSI (or SI) in your case.
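During load testing, you can watch version store consumption in tempdb with a query along these lines (a sketch; the Performance Monitor counters under the Transactions object are another option):

```sql
-- Run in the context of tempdb; the version store lives there.
-- Pages are 8 KB, so multiply by 8 and divide by 1024 for MB.
SELECT
    SUM(version_store_reserved_page_count) * 8 / 1024 AS version_store_mb
FROM sys.dm_db_file_space_usage;
```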
If you want to localize your usage of versioning, rather than enabling it for the whole workload, SI might still be a better choice. By using SI for the read transactions, you will avoid the contention between readers and writers, at the cost of readers seeing the version of the row before any concurrent modification started (more correctly, the read operation under SI will always see the committed state of the row at the time the SI transaction started). There is little or no benefit to using SI for the writing transactions, because write locks will still be taken, and you'll need to handle any write conflicts. Unless that is what you want :)
Note: Unlike RCSI (which, once enabled, applies to all transactions running at read committed), SI has to be explicitly requested using SET TRANSACTION ISOLATION LEVEL SNAPSHOT;.
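Allowing SI on the database and opting in from the reading sessions looks like the following sketch (the database name is a placeholder, and the SELECT is an invented illustration):

```sql
-- One-time, database-level setting ([YourDatabase] is a placeholder):
ALTER DATABASE [YourDatabase]
    SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Then, in each reading session:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

BEGIN TRANSACTION;
    -- All reads in this transaction see the committed state of the data
    -- as of the moment the snapshot transaction started.
    SELECT COUNT_BIG(*) FROM [dbo].[InverterData];
COMMIT TRANSACTION;
```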
Subtle behaviours that depend on readers blocking writers (including in trigger code!) make testing essential. See my linked article series and Books Online for details. If you do decide on RCSI, be sure to review Data Modifications under Read Committed Snapshot Isolation in particular.
Finally, you should ensure your instance is patched to SQL Server 2008 Service Pack 4.
Best Answer
Scenarios that may explain these symptoms include:
- The application code has started a transaction and not committed it
- The application connection has SET IMPLICIT_TRANSACTIONS ON in effect
- The stored procedure includes SET IMPLICIT_TRANSACTIONS ON
Querying the transaction DMVs will help troubleshoot this. The transaction name value will be "implicit_transaction" if the transaction was started implicitly. Other possible values include "user_transaction", "DTC Transaction", or a user-specified transaction name. A "user_transaction" value (with no explicit transaction in the proc) indicates the transaction must have been started on the client side and not committed. A value of "DTC Transaction" indicates a distributed transaction, which also points to a transaction-management issue on the app side.
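A sketch of such a query, joining the session and active transaction DMVs to show each open transaction's name and start time:

```sql
-- One row per session with an open transaction, oldest first.
SELECT
    st.session_id,
    at.name AS transaction_name,   -- implicit_transaction, user_transaction, etc.
    at.transaction_begin_time
FROM sys.dm_tran_session_transactions AS st
JOIN sys.dm_tran_active_transactions AS at
    ON at.transaction_id = st.transaction_id
ORDER BY at.transaction_begin_time;
```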
It is a good practice to include
SET XACT_ABORT ON
in stored procedures with explicit transactions to ensure transactions are rolled back, especially following a client timeout. That probably won't help here since your proc has no explicit transaction, unless the proc happens to raise an error and roll back the errant transaction as a result.
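For reference, a typical template for a procedure that does use an explicit transaction looks like this (the procedure name and body are invented for illustration):

```sql
-- Hypothetical procedure name; the body is a placeholder.
CREATE PROCEDURE [dbo].[UpsertSomething]
AS
BEGIN
    -- XACT_ABORT ensures most errors (and client timeouts that
    -- abort the batch) roll back the transaction automatically.
    SET XACT_ABORT, NOCOUNT ON;

    BEGIN TRANSACTION;
        -- ... data modifications here ...
    COMMIT TRANSACTION;
END;
```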