Sql-server – Poor query performance

Tags: index-statistics, index-tuning, parameter, sql-server, sql-server-2008-r2

We have a large (10,000+ lines) procedure that typically runs in 0.5-6.0 seconds, depending on how much data it has to work with. Over the past month or so it has started taking 30+ seconds after we do a statistics update with FULLSCAN. When it slows down, running sp_recompile "fixes" the issue until the nightly statistics job runs again.
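For anyone reproducing the workaround described above, here is a minimal sketch of checking when the cached plan was compiled and then forcing a recompile. The procedure name dbo.MyLargeProc is a placeholder of mine, not the real name.

```sql
-- dbo.MyLargeProc is a hypothetical name standing in for the real procedure.
-- When was the current plan cached, and how has it performed since?
SELECT ps.cached_time,
       ps.last_execution_time,
       ps.execution_count,
       ps.total_elapsed_time / ps.execution_count AS avg_elapsed_microseconds
FROM   sys.dm_exec_procedure_stats AS ps
WHERE  ps.database_id = DB_ID()
       AND ps.object_id = OBJECT_ID(N'dbo.MyLargeProc');

-- Mark the procedure so that a new plan is compiled on its next execution.
EXEC sp_recompile N'dbo.MyLargeProc';
```

If cached_time falls just after the nightly statistics job, the slow plan was most likely compiled for whichever call happened to run first after the update.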

By comparing the slow and fast execution plans, I have narrowed it down to a specific table/index. When it runs slow, it estimates that ~300 rows will be returned from a specific index; when it runs fast, it estimates 1 row. The slow plan also uses a Table Spool after the seek on the index, whereas the fast plan has no Table Spool.

Using DBCC SHOW_STATISTICS, I graphed the index histogram in Excel. I would normally expect the graph to look more like "rolling hills", but instead it looks like a mountain, with the highest point 2x-3x higher than most other values on the graph.
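For reference, a sketch of the commands that produce the histogram data for graphing. The table and index names (dbo.MyTable, PK_MyTable) are placeholders, since the real ones are not given.

```sql
-- Hypothetical names: dbo.MyTable and PK_MyTable stand in for the real
-- table and its clustered primary key index.

-- One row per histogram step: RANGE_HI_KEY, RANGE_ROWS, EQ_ROWS,
-- DISTINCT_RANGE_ROWS, AVG_RANGE_ROWS.
DBCC SHOW_STATISTICS ('dbo.MyTable', 'PK_MyTable') WITH HISTOGRAM;

-- Header row: compare Rows with Rows Sampled to see whether the last
-- update read the whole table or only a sample.
DBCC SHOW_STATISTICS ('dbo.MyTable', 'PK_MyTable') WITH STAT_HEADER;
```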

[Image: index histogram]

If I update statistics on it, without FULLSCAN, it looks more normal. If I then run it with FULLSCAN again it looks like I described above.
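The two kinds of update being compared would look roughly like this (same placeholder names, dbo.MyTable and PK_MyTable, which are my assumptions):

```sql
-- Hypothetical table and statistics names.

-- Default behaviour: SQL Server picks a sample size based on table size.
UPDATE STATISTICS dbo.MyTable PK_MyTable;

-- Reads every row to build the histogram.
UPDATE STATISTICS dbo.MyTable PK_MyTable WITH FULLSCAN;
```

Running DBCC SHOW_STATISTICS after each one makes it easy to compare the two histograms side by side.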

This feels like a parameter sniffing issue, and specifically related to the (seemingly) weird index distribution above.

The proc takes in a table-valued parameter. Can parameter sniffing occur on a table-valued parameter?

EDIT: The proc also takes 12 other parameters, some of which are optional, two of which are a start and end date.

Is the histogram odd, or am I barking up the wrong tree?

I am certainly comfortable trying to adjust the query and/or my indexing. If that turns out to be the fix, great; at that point my question is more about the skewed histogram.

I should mention that this is a PK IDENTITY clustered index. We have two systems that talk to each other, one a legacy system, one a new home-grown system. Both systems store similar data. To keep them in sync the PK on this table in the new system is incremented when things are added to the old system, even if the data doesn't come over (a RESEED is done). So there could be some gaps in the numbering in this column. Records are rarely, if ever, deleted.

Any thoughts would be greatly appreciated. I am more than happy to gather/include more info.

Best Answer

This ended up being related to parameter sniffing. It just so happened that some oddly formed versions of this query were being executed RIGHT AFTER the stats were rebuilt. So the cached plan was not representative of the majority of the calls. I used the trick of copying the date parameters to local variables and this is working just fine, with little to no impact on performance. This doesn't answer why the histogram looks so "off", but it does explain my performance issues.
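A minimal sketch of the local-variable trick, for readers who have not seen it. Everything here (procedure, table, column and parameter names) is made up for illustration; the real procedure is far larger.

```sql
-- Hypothetical procedure and table; only the pattern matters.
CREATE PROCEDURE dbo.MyLargeProc
    @StartDate datetime,
    @EndDate   datetime
AS
BEGIN
    SET NOCOUNT ON;

    -- Copy the parameters into local variables. The optimizer does not
    -- sniff local variable values when it compiles the statement, so the
    -- plan is built from generic estimates rather than from whatever
    -- dates the first caller happened to pass in.
    DECLARE @LocalStartDate datetime = @StartDate,
            @LocalEndDate   datetime = @EndDate;

    SELECT o.OrderID
    FROM   dbo.Orders AS o
    WHERE  o.OrderDate >= @LocalStartDate
           AND o.OrderDate <  @LocalEndDate;
END;
```

On SQL Server 2008 and later, OPTION (OPTIMIZE FOR UNKNOWN) on the affected statement is another way to stop the first caller's values from shaping the plan, and OPTION (RECOMPILE) trades extra compile time for a plan built for each call. I have not compared these against the local-variable approach on this procedure.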

Related Solutions

Sql-server – Query Performance Issue

You say

Above plan is for subid = 11 or 7 in @t table variable

I think you may be under a misapprehension here. SQL Server does not look at the contents of the table variable and choose a plan based upon the values it contains.

The statement is compiled before the table variable contains any rows at all and you will get the same plan (that assumes a single row) regardless of whether it eventually contains 2 (and would match 95.5% of the rows) or 1 (and would match only 0.0008%).

The table variable may of course also contain multiple rows but SQL Server will not take account of that except if you use the OPTION (RECOMPILE) hint and even then there are no statistics on table variables so it cannot take any account of actual values.
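As a sketch of what the hint looks like in practice: the definition of @t and the shape of the query are my assumptions, pieced together from the predicates mentioned in this answer, not the original code.

```sql
-- Assumed definition of @t; the original is not shown.
DECLARE @t TABLE (SubID int PRIMARY KEY);

INSERT INTO @t (SubID) VALUES (7), (11);

-- With OPTION (RECOMPILE) the statement is compiled after @t has been
-- populated, so the optimizer sees the actual row count (2 here).
-- It still has no statistics on the values stored in @t.
SELECT TOP (1) q.QueueItemID
FROM   QueueTable AS q
WHERE  q.IsProcessed = 0
       AND q.QCode = 'USA'
       AND q.SubID IN (SELECT SubID FROM @t)
ORDER  BY q.QueueItemID
OPTION (RECOMPILE);
```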

Some alternate plans are below

[Image: plan 1]

[Image: plan 2]

These require finding all matching rows and sorting them.

Because NCx_1 is not declared as a unique index the include(QueueItemID) is ignored (as explained in More About Nonclustered Index Keys) and QueueItemID gets added as an index key column instead. This means that SQL Server can seek on IsProcessed, QCode and the matching rows will be ordered by QueueItemID.
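To make that concrete, here is what the index definition might look like. This is a guess at its shape based on the seek columns mentioned above, and it assumes QueueItemID is the clustered index key; the real definition is not shown in the question.

```sql
-- Assumed definition of NCx_1 (the key columns are inferred, not confirmed).
CREATE NONCLUSTERED INDEX NCx_1
    ON QueueTable (IsProcessed, QCode)
    INCLUDE (QueueItemID);

-- Because the index is not UNIQUE, the clustering key is appended to the
-- nonclustered key, so the effective key order is:
--   (IsProcessed, QCode, QueueItemID)
-- and rows matching a seek on IsProcessed, QCode come back ordered by
-- QueueItemID without a sort.
```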

The plan in your question therefore avoids a sort operation but performance is entirely reliant upon how many rows in practice need to be evaluated before the first one matching the SubID IN (SELECT SubID FROM @t) predicate is found and the range seek can stop.

Of course this can vary wildly depending on how common the SubID values contained in @t are and whether there is any skew in the distribution of these values with respect to QueueItemID. You say that around 350k rows match the seek predicate and that around 350k rows end up being read, so for SubID = 7 it sounds as if the matching rows are all at the end, or perhaps no rows match at all, which would be the worst case for this plan.

It would be interesting to know what the estimated number of rows coming out of the seek is. Presumably this is much less than 350,000 and thus SQL Server chooses the plan you see based on this estimated cost.

If the table variable will always just have few rows you might find this rewrite works better for you.

SELECT TOP 1 QueueItemID
FROM   @t
       CROSS APPLY (SELECT TOP 1 t.QueueItemID
                    FROM   QueueTable t
                    WHERE  t.IsProcessed = 0
                           AND t.QCode = 'USA'
                           AND SubID = [@t].SubID
                    ORDER  BY t.QueueItemID) CA 
ORDER BY QueueItemID

For me it gives the plan below where it seeks into the index on subid,isprocessed,qcode,queueitemid as many times as you have rows in the table variable. It is similar to the first plan shown but may be slightly more efficient as each seek stops after the first row is returned.

[Image: plan]

Sql-server – statistics are up to date, but estimate is incorrect

There is a simple solution to this:

Drop all of the _dta_... statistics and stop blindly applying DTA recommendations.
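A sketch of how one might list those statistics and generate the DROP statements for review. It assumes the default DTA naming prefix (_dta_) and only looks at statistics created with CREATE STATISTICS, since statistics that belong to an index cannot be dropped this way.

```sql
-- Lists DTA-created column statistics in the current database and builds a
-- DROP STATISTICS statement for each. Review the output before running it.
SELECT OBJECT_SCHEMA_NAME(s.object_id) AS schema_name,
       OBJECT_NAME(s.object_id)        AS table_name,
       s.name                          AS stats_name,
       N'DROP STATISTICS '
         + QUOTENAME(OBJECT_SCHEMA_NAME(s.object_id)) + N'.'
         + QUOTENAME(OBJECT_NAME(s.object_id)) + N'.'
         + QUOTENAME(s.name) + N';'    AS drop_statement
FROM   sys.stats AS s
WHERE  s.name LIKE N'[_]dta[_]%'      -- brackets escape the _ wildcard
       AND s.user_created = 1         -- excludes statistics owned by indexes
       AND OBJECTPROPERTY(s.object_id, 'IsUserTable') = 1;
```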

More information

The particular problem was that there were multiple sets of statistics for the column in question. The extra dta statistics were created by sampling the data (the default behaviour for statistics not associated with an index).

As is often the case with sampled statistics, the resulting histograms did not cover the full range of the underlying data. The query in the question happened to choose a value that was outside the histogram, resulting in a 1-row estimate.

The exact behaviour of the query optimizer when multiple sets of statistics exist for the same column is not fully documented. It does tend to prefer 'full scan' statistics over sampled, but it also prefers more recently-updated statistics to older ones.
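To see whether a column has several competing sets of statistics, something along these lines can help. dbo.MyTable and MyColumn are placeholder names.

```sql
-- Hypothetical table and column names.
-- Lists every statistics object whose leading column is MyColumn,
-- most recently updated first.
SELECT s.name                               AS stats_name,
       s.auto_created,
       s.user_created,
       STATS_DATE(s.object_id, s.stats_id)  AS last_updated
FROM   sys.stats AS s
       JOIN sys.stats_columns AS sc
         ON sc.object_id = s.object_id
            AND sc.stats_id = s.stats_id
       JOIN sys.columns AS c
         ON c.object_id = sc.object_id
            AND c.column_id = sc.column_id
WHERE  s.object_id = OBJECT_ID(N'dbo.MyTable')
       AND c.name = N'MyColumn'
       AND sc.stats_column_id = 1  -- the histogram is built on the leading column
ORDER  BY last_updated DESC;
```

DBCC SHOW_STATISTICS ... WITH STAT_HEADER on each result then shows Rows against Rows Sampled, which tells you which of them were built from a sample.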
