From the background information you have provided it sounds like you have moved the contention point on your server from CPU to IO, as a result of implementing an indexing strategy targeted at specific troublesome queries.
For the most part (without getting into the specifics of what exactly your indexing strategy is) this is expected behaviour because with CPU no longer being a contention point, WAITS are naturally occurring elsewhere.
The performance issues you are seeing now are not necessarily caused by the index modifications you have applied. Further investigation into identifying the specific queries that are responsible for the IO waits/as well as the timeouts that you are seeing is required. You want to review the execution plans for each of these scenarios to identify possible areas for optimization (look for typically undesirable operations first such as implicit conversions, table scans etc.).
Not wanting to be presumptuous however I often encounter query timeouts on customer environments due to poor query plan choices as a result of parameter sniffing. This may also be something to look into.
For some handy queries to use during you investigations take a look at Glenn Berry's SQL Server Diagnostic Queries. There are a number of fantastic queries in there for looking at IO, for example which databases are responsible for the most IO, which database data files are seeing the most activity, which queries cause most reads/writes etc.
Well from an application point of view there are:
- connection timeout (how long the app is willing to wait to establish a connection to SQL Server)
- command timeout (how long the app is willing to wait for a command to complete, including pulling the results down from SQL Server)
Back in my classic ASP days, the defaults for these were 15 and 30 seconds respectively, I have no idea what they are by default in .NET today.
SQL Server has its own set of timeouts, for example:
- Remote query timeout. Default is 600 seconds (10 minutes).
- Remote login timeout. Default is 10 seconds.
- Query wait. Default is -1 (25 x query cost).
- Full-text protocol handler timeout. Default is 60 seconds.
You can see these values for your system here:
SELECT * FROM sys.configurations
WHERE configuration_id IN (1519,1520,1541,1557);
There is also @@LOCK_TIMEOUT
(which defaults to -1 (infinity)). This is how long SQL Server will wait on a blocked resource. You can override this for a particular session using SET LOCK_TIMEOUT
. More details here.
Deadlocks I suppose could also fall into this category as well. The system scans for deadlock situations every 5 seconds, and there is no magic formula to determine when the deadlock will occur in relation to when any of the involved requests started. This is because SQL Server doesn't let the oldest transaction win; it chooses the victim based on DEADLOCK_PRIORITY and the estimated amount of resources required to roll the victim back. More details here.
There is also a memory grant timeout (which may be customized using Resource Governor). Depending on concurrency, a query won't necessarily fail if it reaches the timeout before obtaining all of the memory requested, it will just run with the allocated amount (and therefore might be less efficient). If it fails you will likely see Msg 8645.
You can get an idea for other potential timeout scenarios that may occur within SQL Server by reviewing these error messages:
SELECT message_id, [text]
FROM sys.messages
WHERE language_id = 1033
AND ([text] LIKE '%timeout%' OR [text] LIKE '%time out%')
However I don't think it is practical, feasible or productive for anyone to try to provide you with a complete and exhaustive list of every single timeout situation possible. Solve the problems you're having, rather than prematurely solving a whole bunch of problems you probably never will...
Best Answer
This looks like you are mixing up counts and duration lock waits/sec is a count of events per second average wait time - is the duration of the sum of the lock waits / the number of waits in that second.
so the 2 indicators are only correlated if an increase in the number of waits leads to an equivalent increase in the duration of these waits - but there is no reason for this to be true, Also worth noting that there are many different sorts of lock within these numbers - row, page, extent, table etc