Sql-server – intermittent performance issue – query with multiple CTE and trashed execution plan

execution-planperformancequery-performancesql-server-2008-r2

I'm trying to optimize a SQL query that shows variable performance (less than 2 to more than 15s).

Overview

  • This is a ColdFusion website, with a SQL Server 2008 R2 backend
  • The query is part of the website search engine – so used quite a bit.

This is a simplified view of the query (some JOINs and WHERE were left out)

with AllEligibleEntries AS (
    SELECT DISTINCT b.*,
        c.*,
        DENSE_RANK() OVER (ORDER BY b.CreationDate DESC) AS RowNum  
    From Ticket b
    LEFT JOIN Customer c ON b.CustomerID = c.CustomerID         -- CustomerID is a string, and NOT the primary key
    Where b.TicketID LIKE @P1                                   -- TicketID is a string, and NOT the primary key
        -- various other conditions here, some always present, other added as required by coldfusion through <cfif ...> AND ... </cfif>
), 
NumberOfEligibleEntries AS (
    select MAX(AllEligibleEntries.RowNum) AS NbRows
    from AllEligibleEntries
),
SelectedEntries AS (
    Select AllEligibleEntries.*
    From AllEligibleEntries
    WHERE AllEligibleEntries.RowNum BETWEEN @P2 AND @P3         -- used to set the number of results per page and the page to display
                                                                -- in all testcases, @P2 is 1 and @P3 is 50, but up to 1000 in production
),
MostRecentAction AS (                                           -- "Greatest-N-Per-Group" - [0]
    Select b1.Ticket, b1.StateID
    FROM TicketEvents b1
    LEFT OUTER JOIN TicketEvents b2
      ON b1.Ticket = b2.Ticket
         AND b1.TicketEventID < b2.TicketEventID
    WHERE b2.Ticket IS NULL
      AND b1.Ticket IN (Select Ticket from SelectedEntries)
),
CompletedTickets AS (
    Select TicketEventID, Ticket, TimeStamp
    From TicketEvents
    Where StateID = 5                                           -- this is the "completed" state
      AND Ticket IN (Select Ticket from SelectedEntries)
),
CompletionDate AS (     
    Select b1.Ticket, b1.TimeStamp
    FROM CompletedTickets b1
    LEFT OUTER JOIN CompletedTickets b2
      ON b1.Ticket = b2.Ticket
         AND b1.TicketEventID < b2.TicketEventID
    WHERE b2.Ticket IS NULL
)
Select SelectedEntries.*, NumberOfEligibleEntries.NbRows,
    d.TimeStamp AS CompletionDate
From SelectedEntries
left join NumberOfEligibleEntries on 1 = 1                      -- ugly hack, so the total number of hits can be displayed in the search result page
left join MostRecentAction a on SelectedEntries.Ticket = a.Ticket
left join CompletionDate d on SelectedEntries.Ticket = d.Ticket
order by SelectedEntries.RowNum

I'm aware of two potential performance issues with the way this query is written:

  1. Trashed execution plan

Using the same approach than ColdFusion (i.e. sp_prepexec, then sp_execute), I sp_prepexec'ed the query with a value for @P1 that would cause AllEligibleEntries to contain no results, then sp_execute'd it with another @P1 chosen so AllEligibleEntries would contain 49 rows. The execution plan of the second run will be posted to sqlperformance.com as soon as I find why I get an error trying to upload it, but is thouroughly wrong (>2000 executions of index seeks on > 6000 rows when two runs on two rows were expected, and many others).

I've confirmed that using option(recompile) (hinted by [1]) fixes this, but that seems much more like a patch than a fix.

Question: what are the alternatives to option(recompile) in a case like this?

  1. Multiple execution of queries

While investigating this issue, I read [2] that CTE are only a nice way to write queries, but in the end, they're substituted when used, and therefore can be executed multiple times. If this is the case (I haven't confirmed it yet), the original AllEligibleEntries CTE would have to be executed something like 4 times.

Question: if this is the case, are there options to ensure that the CTE are only executed once? (I guess that the answer is temporary tables, but this too feels more like a patch than a fix.)

Question: are there other major performance issues I didn't spot?

Thanks

[0] https://stackoverflow.com/questions/tagged/greatest-n-per-group

[1] How to optimise T-SQL query using Execution Plan

[2] https://social.msdn.microsoft.com/Forums/sqlserver/en-US/d9f9e394-97e6-49af-9e2f-aab62165d760/temp-table-vs-cte-vs-derived-table?forum=transactsql

Best Answer

For now, the issue has been addressed by:

  • Converting AllEligibleEntries to a temporary table
  • Converting NumberOfEligibleEntries to a local variable, obtained from a select on the temporary table
  • Converting SelectedEntries to a DELETE WHERE NOT BETWEEN operation on the temporary table
  • Keeping the remaining CTEs as-is.

Performance is measurably better (a few 100ms) than the option(recompile) approach, and the (now multiple) execution plans are cumulatively much simpler.

I'm still interested in feedback on this issue, although I may not revisit the code soon.

Thanks.