Sql-server – What are the possible causes for sp_reset_connection taking a long time to execute

sql serversql-server-2008

Why would the sp_reset_connection system stored procedure be taking longer than a few milliseconds to execute, as viewed through SQL Server Profiler?

I took a simple trace from a production system using SQL Server Profiler and then used SqlNexus to analyze it. SqlNexus indicates that sp_reset_connection has the highest cumulative duration — 33% of the overall trace. The observed duration ranges from 0-7 seconds (12 to 6,833,270 microseconds) but averages at 0.956s.

I understand that sp_reset_connection is being called when a pooled connection gets reused. I have seen a suggestion that this can be happening due to extraneous traces, but that doesn't seem to be the case.

I have read what the server is doing when the sproc is called but I don't believe any of those would be problematic in this case — the code is not leaving open transaction or huge temporary tables that would need to be cleaned up.

I also looked at https://serverfault.com/questions/199974/sp-reset-connection-taking-a-long-time-to-run but it wasn't helpful.

EDIT (2013-12-23):
In all cases, reads and writes are 0 and CPU is almost always 0 (only two instances of non-zero CPU, both at 16ms).

Best Answer

Finally got some time to write a more detailed answer.

There are typically three main reasons a simple procedure like sp_reset_connection will take a long time to run.

  1. You are waiting for CPU resources
  2. You are blocked on a lock somewhere (perhaps as a result of DML or a competing transaction)
  3. Your network is slow and it takes a long time to return the result back to the client

Ad 1) If you are waiting for CPU resources, this should show up as signal waits. Please see my comment on your question on how to diagnose if this is the issue

Ad 2) If you are waiting for a lock, this is best diagnosed by comparing two snapshots of sys.dm_os_wait_stats. See this article on how to do this:

If you see long waits for LCK_[Something], query sys.dm_tran_locks to track down which objects are being locked. In your case, I would expect to see some form of SCH-[Something]> locks blocking you.

Ad 3) The easiest way to diagnose network issues to first look for OLEDB and ASYNC_NETWORK_IO waits in step 2 (if you wait a long time for network, one of those show up). If those waits are high, use xperf -on latency or a network monitoring program like netmon or wireshark to check your latencies. If the network looks slow, this could also be caused by the calling application server not responding fast enough to the connection being recycled.