Sql-server – SQL Server sometimes uses a lot more cpu for query

performancequery-performanceselectsql serversql-server-2012stored-procedures

I have a stored procedure with a quite heavy SELECT statement that includes five joins and uses FieldName LIKE '%searchterm%' on multiple columns. This query usually finishes in about 2-5 seconds. But sometimes (at seemingly random points in time) the server goes into a state where the query takes ~120 seconds to finish. This state can last from a couple of minutes to several hours. I have indications that you can end the slow state by calling DBCC FREEPROCCACHE, but I'm not completely sure. I'm using SQL Server 11.0.3153.

I've used the profiler tool to trace the server. Most of the time the query requires 900 CPU and 500 000 reads, but in the slow state it's 100 000 CPU and 7.5*10^6 reads. Even when the server is in the slow state it only seems to affect connections made by my .NET application, I can still execute the stored procedure quickly via SQL Server Management Studio.

The query looks something like this:

SET @search = REPLACE(@search, ' ', '%');
SET @search = '%' + @search + '%';

SELECT
    ...
FROM
    Table1 t1,
    Table1 t2
    /* The real query uses 3 more joined tables left out here */
WHERE
    t1.id = t2.t1_id
    AND (
        (t1.col1 LIKE @search)
        OR (t2.col1 LIKE @search)
        OR (t1.id IN (SELECT t1_id FROM Table3 t3 WHERE t3.col1 LIKE @search)
        )
ORDER BY
    SomeCol DESC,
    SomeOtherCol DESC
OFFSET @row_offset ROWS FETCH NEXT @row_count ROWS ONLY

Can anyone tell me what's going on here? What can be the cause of the inconsistent behavior?

Best Answer

That sounds like it is using an inappropriate query plan. There are two obvious paths this query could take and it is presumably deciding between them depending on the values of @row_offset and @row_count: for a small offset it is probably better to scan SomeCol,SomeOtherCol in order and apply the complex filtering clauses (in the WHERE and JOIN ON clauses) to each row until enough have been found to satisfy @row_count. For many other circumstances this would be a terribly bad approach.

To be more specific we'd need details on those other tables, the indexes/keys, and the sort of size/distribution of each table's contents.

What I suspect is happening is a cached query plan is being used when that plan is inappropriate for the current input values. When you run DBCC FREEPROCCACHE the cache is purged so it is forced to create a new plan and use that.

You can try the RECOMPILE hint by appending OPTION(RECOMPILE) to direct SQL and WITH RECOMPILE on the stored procedure definition if it is a procedure rather than an ad-hoc query - this tells SQL Server not to bother checking the procedure cache and always generate a fresh plan. This of course means that you lose the benefit on occasions where using a cached plan instead of having to reassess stats and replan would have been faster - this may be an acceptable tradeoff but obviously you need to benchmark and make sure you are not degrading overall performance more than is acceptable (if so you'll need a different strategy)).