Sql-server – Difference between Seek Predicate and Predicate

execution-planperformancequery-performancesql server 2014

I'm trying to performance tune a query that we have in SQL Server 2014 Enterprise.

I have opened the actual query plan in SQL Sentry Plan Explorer and I can see on one node that it has a Seek Predicate and also a Predicate

What's the difference between Seek Predicate and Predicate?

Note: I can see that there are plenty of problems with this node (e.g. the Estimated vs Actual rows, the residual IO), but the question does not relate to any of that.

Best Answer

Let's throw one million rows into a temp table along with a few columns:

CREATE TABLE #174860 (
PK INT NOT NULL, 
COL1 INT NOT NULL,
COL2 INT NOT NULL,
PRIMARY KEY (PK)
);

INSERT INTO #174860 WITH (TABLOCK)
SELECT RN
, RN % 1000
, RN % 10000
FROM 
(
    SELECT TOP 1000000 ROW_NUMBER () OVER (ORDER BY (SELECT NULL)) RN
    FROM   master..spt_values v1,
           master..spt_values v2
) t;

CREATE INDEX IX_174860_IX ON #174860 (COL1) INCLUDE (COL2);

Here I have a clustered index (by default) on the PK column. There's a nonclustered index on COL1 that has a key column of COL1 and includes COL2.

Consider the following query:

SELECT *
FROM #174860
WHERE PK >= 15000 AND PK < 15005
AND COL2 = 5000;

Here I'm not using BETWEEN because Aaron Bertrand is hanging around this question.

How should SQL Server optimizer that query? Well, I know that the filter on PK will reduce the result set to five rows. SQL server can use the clustered index to jump to those five rows instead of reading through all million rows in the table. However, the clustered index only has the PK column as a key column. Once the row is read into memory we need to apply the filter on COL2. Here, PK is a seek predicate and COL2 is a predicate.

SQL server finds five rows using the seek predicate and further reduces those five rows to one row with the normal predicate.

If I define the clustered index differently:

CREATE TABLE #174860 (
PK INT NOT NULL, 
COL1 INT NOT NULL,
COL2 INT NOT NULL,
PRIMARY KEY (COL2, PK)
);

And run the same query I get different results:

In this case, SQL Server can seek using both columns in the WHERE clause. Exactly one row is read from the table using the key columns.

For one more example consider this query:

SELECT *
FROM #174860
WHERE COL1 = 500
AND COL2 = 3545;

The IX_174860_IX index is a covering index because it contains all of the columns needed for the query. However, only COL1 is a key column. SQL Server can seek with that column to find the 1000 rows with a matching COL1 value. It can further filter down those rows on the COL2 column to reduce the final result set to 0 rows.

Related Solutions

Sql-server – Does SentryOne Plan Explorer count the reads in UDFs

The biggest problems we have here are:

Like @JNK says, SQL Server obfuscates away the use of a UDF, and does terrible things with them anyway (like always estimates one row). When you generate an actual plan in SSMS, you don't see its use at all either. Plan Explorer subject to the same limitations because it can only provide the information about a plan that SQL Server provides.
The code relies on different sources for runtime metrics when generating an actual plan. Unfortunately the plan XML does not include function calls, and SQL Server doesn't reveal the I/O incurred by a function when using SET STATISTICS IO ON; either (this is how the Table I/O tab gets populated).

Consider the following view and function against AdventureWorks2012. This is just a silly attempt at returning a random row from the detail table given a random row from the header table - mostly to make sure we generate as much I/O as possible, every time.

CREATE VIEW dbo.myview 
WITH SCHEMABINDING
AS
  SELECT TOP (100000) rowguid, SalesOrderID, n = NEWID() 
    FROM Sales.SalesOrderDetail ORDER BY NEWID();
GO

CREATE FUNCTION dbo.whatever(@SalesOrderID INT)
RETURNS UNIQUEIDENTIFIER
WITH SCHEMABINDING
AS
BEGIN
  RETURN 
  (
    SELECT TOP (1) rowguid FROM dbo.myview 
     WHERE SalesOrderID = @SalesOrderID ORDER BY n
  );
END
GO

What Management Studio Does (and Does Not) Tell You

Take the following query in SSMS:

SET STATISTICS IO ON;

  SELECT TOP (5) SalesOrderID, dbo.whatever(SalesOrderID) 
    FROM Sales.SalesOrderHeader ORDER BY NEWID();

SET STATISTICS IO OFF;

When you estimate a plan, you get a plan for the query and a single plan for the function (not 5, as you might hope):

You don't get any I/O data at all, obviously, since the query wasn't actually executed. Now, generate an actual plan. You get the 5 rows you expected in the results grid, the following plan (which makes absolutely no visible mention of the UDF, except in the XML you can find it as part of the query text and as part of the Scalar Operator):

And the following STATISTICS IO output (which makes absolutely no mention of Sales.SalesOrderDetail, even though we know it had to read from that table):

Table 'SalesOrderHeader'. Scan count 1, logical reads 57, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

What Plan Explorer Tells You

When PE generates an estimated plan for the same query, it knows about the same thing as SSMS. However it does show things in a slightly more intuitive way. For example, the estimated plan for the outer query shows how the output of the function is combined with the output of the query, and it is immediately clear - within a single plan diagram - that there is I/O from both tables:

It also shows the function's plan by itself, which I'm only including for completeness:

Now, let's take a look at an actual plan, which is thousands of times more useful. The downside here is, again, it only has the information that SQL Server decides to show, so it can only expose the graphical plan diagram(s) that SQL Server provides. This isn't a situation where someone decided not to show you something useful; it just doesn't know anything about it based on the plan XML that is provided. In this case, it is just like in SSMS, you can only see the outer query's plan, and it's as if the function isn't being called at all:

The Table I/O tab, too, still relies on the output of STATISTICS IO, which also ignores any of the activity performed in the function call:

However, PE gets the entire call stack for you. I have occasionally heard people ask, "Pffft, when will I ever need the call stack?" Well, you can actually break down the time spent, CPU used, and number of reads (and, for TVFs, number of rows produced) for every single function call:

Unfortunately, you don't have the ability to correlate that back to which table(s) the I/O is coming from (again, because SQL Server doesn't provide that information), and it isn't labeled with the UDF name (because it's captured as an ad hoc statement, not the function call itself). But what it does allow you to see, that Management Studio doesn't, is what a dog your UDF is being. You still have to join some dots, but there are fewer dots and they're closer together.

About Profiler

Finally, I would strongly recommend staying away from Profiler, unless it is to set up a server-side trace that you are going to script and then run outside the scope of any UI tool. Using Profiler against a production system is almost certainly going to cause more problems than it will ever solve. If you want to get this information, please use a server-side trace or extended events, and be sure to filter very wisely. Even without profiler, a trace can impact your server, and retrieving showplans through extended events is not the most efficient thing in the world either.

Sql-server – Capture Complex Actual Execution Plans

I can't say for certain because I've never dealt with that large a stored procedure but you could try this:

SELECT query_plan
FROM sys.dm_exec_procedure_stats
CROSS APPLY sys.dm_exec_query_plan(plan_handle)
WHERE object_id = object_id('procedure_name')

Then save the output to a text file.

Best Answer

Related Solutions

Sql-server – Does SentryOne Plan Explorer count the reads in UDFs

Sql-server – Capture Complex Actual Execution Plans

Related Question