SQL Server Performance Monitoring – Understanding Page Lookups Percentage

performance, sql server

I'm currently doing some work on monitoring SQL Server performance. I've found the following script[1], which calculates 'Page Lookups Percentage' – the suggestion being that a "good" value is something less than 100.

-- Total page lookups divided by total batch requests;
-- both counters accumulate from instance startup.
SELECT 1.0 * cntr_value /
       (SELECT 1.0 * cntr_value
        FROM sys.dm_os_performance_counters
        WHERE counter_name = 'Batch Requests/sec') AS [PageLookupPct]
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page lookups/sec'
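
For reference, the raw counter values behind the ratio can be checked directly:

SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN ('Page lookups/sec', 'Batch Requests/sec')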

Now I've run this script on about a dozen different servers – mostly virtual but a couple physical, mostly 2008 R2 Enterprise but a couple of 2005, some live and some development, and even a physical 2008 R2 Express edition that has only ever been a witness server and is currently dormant (no mirrors set up with active failover). On only one of these servers has the value been below 100. In most cases it has been in the thousands, or at least the mid hundreds. On the dormant witness server the value was a ridiculous 20,000!

Does this indicate that we have widespread problems in this area, or is there some innocent explanation for why this figure is so high on so many different servers? We are experiencing slow performance on some of these servers, so it would be useful to get a better understanding of what this figure is actually telling me.

[1] Thomas LaRock – http://thomaslarock.com/2012/05/are-you-using-the-right-sql-server-performance-metrics/

Best Answer

The metric is trying to measure the average number of logical IO operations performed by each request. For example, 10,000 page lookups/sec against 200 batch requests/sec works out to 50 page lookups per batch. If your server is, let's say, serving a busy web site, then it is probably a good metric (perhaps 100 is even too high for such a workload!).

The witness will only ever have internal activity. In other words, it touches its data files internally, due to various background processes, despite never seeing an actual batch. As a result, all of that internal logical IO gets attributed to the handful of batches the server has ever received, which inflates the ratio enormously.
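
You can see how starved the denominator is on such a server by putting the total batch count next to the uptime (sys.dm_os_sys_info exposes sqlserver_start_time on 2008 and later; on 2005 the create date of tempdb is the usual stand-in):

-- A dormant witness will show a tiny batch total next to a long uptime
SELECT (SELECT cntr_value
        FROM sys.dm_os_performance_counters
        WHERE counter_name = 'Batch Requests/sec') AS [TotalBatchRequests],
       DATEDIFF(HOUR, sqlserver_start_time, GETDATE()) AS [UptimeHours]
FROM sys.dm_os_sys_info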

I would say that the metric Thomas is recommending has some value, but only in a workload similar to the one Thomas had in mind. A server that has only background activity, like your witness, is not going to report anything meaningful with this ratio. A server that runs intensive DW reports may see only one batch request per hour and run millions of page lookups for that one request.

An absolute number like 'must be below 100' is only valuable when accompanied by small-print disclaimers like 'must be below 100 if your workload has frequent enough batch requests to establish statistically significant trends, and each batch request is expected to perform under 100 page lookups on average'. You get the idea...
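
If you still want to track this ratio, sample both counters twice and work with the deltas, so the number reflects recent activity instead of everything since startup. A rough sketch (the 10-second window is arbitrary; pick whatever suits your monitoring):

DECLARE @Lookups1 bigint, @Batches1 bigint,
        @Lookups2 bigint, @Batches2 bigint

SELECT @Lookups1 = cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page lookups/sec'

SELECT @Batches1 = cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Batch Requests/sec'

WAITFOR DELAY '00:00:10'  -- sample window

SELECT @Lookups2 = cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page lookups/sec'

SELECT @Batches2 = cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Batch Requests/sec'

-- NULLIF avoids a divide-by-zero on an idle server;
-- NULL here simply means no batches arrived in the window
SELECT 1.0 * (@Lookups2 - @Lookups1)
     / NULLIF(@Batches2 - @Batches1, 0) AS [PageLookupsPerBatch]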