Sql-server – Does SentryOne Plan Explorer count the reads in UDFs

execution-planfunctionsquery-performancesql serversql-server-2008

I have a query like this:

select dbo.fn_complexFunction(t.id)
from mytable t

In SQL Sentry Plan Explorer, I noticed I have to run Get Estimated Plan to make the Query Plan include the UDF.

When running 'Get Actual Plan', it doesn't appear that the logical reads and other metrics include the operations that occur in the UDF. In cases like this, is the only workaround to use Profiler?

Best Answer

The biggest problems we have here are:

Like @JNK says, SQL Server obfuscates away the use of a UDF, and does terrible things with them anyway (like always estimates one row). When you generate an actual plan in SSMS, you don't see its use at all either. Plan Explorer subject to the same limitations because it can only provide the information about a plan that SQL Server provides.
The code relies on different sources for runtime metrics when generating an actual plan. Unfortunately the plan XML does not include function calls, and SQL Server doesn't reveal the I/O incurred by a function when using SET STATISTICS IO ON; either (this is how the Table I/O tab gets populated).

Consider the following view and function against AdventureWorks2012. This is just a silly attempt at returning a random row from the detail table given a random row from the header table - mostly to make sure we generate as much I/O as possible, every time.

CREATE VIEW dbo.myview 
WITH SCHEMABINDING
AS
  SELECT TOP (100000) rowguid, SalesOrderID, n = NEWID() 
    FROM Sales.SalesOrderDetail ORDER BY NEWID();
GO

CREATE FUNCTION dbo.whatever(@SalesOrderID INT)
RETURNS UNIQUEIDENTIFIER
WITH SCHEMABINDING
AS
BEGIN
  RETURN 
  (
    SELECT TOP (1) rowguid FROM dbo.myview 
     WHERE SalesOrderID = @SalesOrderID ORDER BY n
  );
END
GO

What Management Studio Does (and Does Not) Tell You

Take the following query in SSMS:

SET STATISTICS IO ON;

  SELECT TOP (5) SalesOrderID, dbo.whatever(SalesOrderID) 
    FROM Sales.SalesOrderHeader ORDER BY NEWID();

SET STATISTICS IO OFF;

When you estimate a plan, you get a plan for the query and a single plan for the function (not 5, as you might hope):

You don't get any I/O data at all, obviously, since the query wasn't actually executed. Now, generate an actual plan. You get the 5 rows you expected in the results grid, the following plan (which makes absolutely no visible mention of the UDF, except in the XML you can find it as part of the query text and as part of the Scalar Operator):

And the following STATISTICS IO output (which makes absolutely no mention of Sales.SalesOrderDetail, even though we know it had to read from that table):

Table 'SalesOrderHeader'. Scan count 1, logical reads 57, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

What Plan Explorer Tells You

When PE generates an estimated plan for the same query, it knows about the same thing as SSMS. However it does show things in a slightly more intuitive way. For example, the estimated plan for the outer query shows how the output of the function is combined with the output of the query, and it is immediately clear - within a single plan diagram - that there is I/O from both tables:

It also shows the function's plan by itself, which I'm only including for completeness:

Now, let's take a look at an actual plan, which is thousands of times more useful. The downside here is, again, it only has the information that SQL Server decides to show, so it can only expose the graphical plan diagram(s) that SQL Server provides. This isn't a situation where someone decided not to show you something useful; it just doesn't know anything about it based on the plan XML that is provided. In this case, it is just like in SSMS, you can only see the outer query's plan, and it's as if the function isn't being called at all:

The Table I/O tab, too, still relies on the output of STATISTICS IO, which also ignores any of the activity performed in the function call:

However, PE gets the entire call stack for you. I have occasionally heard people ask, "Pffft, when will I ever need the call stack?" Well, you can actually break down the time spent, CPU used, and number of reads (and, for TVFs, number of rows produced) for every single function call:

Unfortunately, you don't have the ability to correlate that back to which table(s) the I/O is coming from (again, because SQL Server doesn't provide that information), and it isn't labeled with the UDF name (because it's captured as an ad hoc statement, not the function call itself). But what it does allow you to see, that Management Studio doesn't, is what a dog your UDF is being. You still have to join some dots, but there are fewer dots and they're closer together.

About Profiler

Finally, I would strongly recommend staying away from Profiler, unless it is to set up a server-side trace that you are going to script and then run outside the scope of any UI tool. Using Profiler against a production system is almost certainly going to cause more problems than it will ever solve. If you want to get this information, please use a server-side trace or extended events, and be sure to filter very wisely. Even without profiler, a trace can impact your server, and retrieving showplans through extended events is not the most efficient thing in the world either.

Example

Consider the T-SQL scalar function below, created in the AdventureWorks sample database, which returns the name of a product given its ID:

CREATE FUNCTION dbo.DumbNameLookup
(
    @ProductID integer
)
RETURNS dbo.Name
AS
BEGIN
    RETURN
    (
        SELECT
            p.Name
        FROM Production.Product AS p
        WHERE
            p.ProductID = @ProductID
    );
END;

Pre-execution plan

A pre-execution plan (estimated plan in SSMS) shows plan information for the parent statement and nested function calls:

-- Pre-execution plan shows main query and nested function call
SET SHOWPLAN_XML ON;
GO
SELECT dbo.DumbNameLookup(1);
GO
SET SHOWPLAN_XML OFF;

SSMS output:

SSMS pre-execution plan

The same XML viewed in SQL Sentry Plan Explorer shows the nested nature of the calls more clearly:

PE pre-execution plan

Post-execution output

SSMS shows details for only the main query when post-execution plan output is requested:

-- Post-execution plan shows main query only
SET STATISTICS XML ON;
SELECT dbo.DumbNameLookup(1);
SET STATISTICS XML OFF;

SSMS post-execution

The performance impact of doing otherwise can be shown using the Showplan XML Statistics Profile Event Class in SQL Server Profiler, using a query that calls the function multiple times (once per input row):

SELECT TOP (5)
    p.ProductID,
    dbo.DumbNameLookup(p.ProductID)
FROM Production.Product AS p;

Profiler output:

Trace output

There are five separate post-execution plans for the function executions, and one for the parent query. The five function plans look like this in the profiler lower pane:

Function plans

The parent query plan is:

Parent plan

Executing the query without the TOP (5) clause results in a full execution plan for each of the 504 rows in the Product table. You can probably see how this would quickly get out of hand with larger tables.

The situation for triggers is reversed. These do not show any pre-execution plan information, but do include a post-execution plan. This reflects the set-based nature of triggers; each is fired once for all rows affected, rather than once per row.

Best Answer

Related Solutions

Sql-server – Does SentryOne Plan Explorer work

Sql-server – Estimated versus actual query plan with function calls

Example

Pre-execution plan

Post-execution output

Related Question