Sql-server – PageIOLatch_Sh waits noticed after tuning an index

sql serversql-server-2005wait-types

Lately, a lot of queries to the database are timing out. While investigating I found that queries are in the suspended state with wait_type as PageIOLatch_Sh. I have not noticed so many queries being suspended before on the server. This has been happening for the since Saturday.

A couple of weeks ago, the cpu on the db server was at 100% constantly. While trying to fix that I changed a couple of indexes which were not really being used and were resulting in user_scans. After changing the indexes, the cpu usage on the server went down, the performance of the site was generally faster and looking at the index stats now a high number of user_seeks were happening, which I thought was a good sign.

However, now after a week or two, even though the cpu usage is not at 100, a large number of queries have started timing out and PageIoLatch_Sh waits are being observed. The interesting thing is that these timeouts were not happening at such a high frequency earlier even though the cpu was at 100%.

Now maintaining indexes is also something which slows down processing of sql server, so I was wondering whether the new indexes could have resulted in a larger number of these waits and timeouts?

Best Answer

From the background information you have provided it sounds like you have moved the contention point on your server from CPU to IO, as a result of implementing an indexing strategy targeted at specific troublesome queries.

For the most part (without getting into the specifics of what exactly your indexing strategy is) this is expected behaviour because with CPU no longer being a contention point, WAITS are naturally occurring elsewhere.

The performance issues you are seeing now are not necessarily caused by the index modifications you have applied. Further investigation into identifying the specific queries that are responsible for the IO waits/as well as the timeouts that you are seeing is required. You want to review the execution plans for each of these scenarios to identify possible areas for optimization (look for typically undesirable operations first such as implicit conversions, table scans etc.).

Not wanting to be presumptuous however I often encounter query timeouts on customer environments due to poor query plan choices as a result of parameter sniffing. This may also be something to look into.

For some handy queries to use during you investigations take a look at Glenn Berry's SQL Server Diagnostic Queries. There are a number of fantastic queries in there for looking at IO, for example which databases are responsible for the most IO, which database data files are seeing the most activity, which queries cause most reads/writes etc.