Sql-server – Query turned into a CPU gobbling monster

performancequery-performancesql serversql server 2014

I have a SP who's execution time went from 5 minutes > 20 minutes > 30 minutes > 53 minutes over four days

Waits were showing increasing CPU and suspended status

I isolated a single query that pegs the CPU

UPDATE thing.table
SET YYYYMM = 
CASE 
  WHEN 
    DAY(SnapshotDate) = 1
    OR 
  SnapshotDate = (SELECT MAX(SnapshotDate) FROM thing.table) 
  THEN CAST(FORMAT(DATEADD(day,-1,snapshotdate),'yyyyMM') AS INT)
  ELSE NULL
END

I ran it again, adding WITH (RECOMPILE) at the end – no difference

I ran UPDATE STATISTICS thing.table – no difference

It would be interesting to run it and get the actual plan but I don't want to peg the CPU for an hour. I checked sys.dm_exec_cached_plans but it appears to only have the estimated plan not the actual plan

I rewrote using CONVERT instead of FORMAT (because I am suspicious of new things) – no difference

So I rewrote like this and took execution back down to a few seconds:

BEGIN TRAN;

UPDATE thing.table
SET YYYYMM = NULL;

UPDATE thing.table
SET YYYYMM = CAST(FORMAT(DATEADD(day,-1,snapshotdate),'yyyyMM') AS INT)
WHERE 
(
DAY(SnapshotDate) = 1 
OR
SnapshotDate = (SELECT MAX(SnapshotDate) FROM thing.table) 
);

COMMIT TRAN;

The table has about 150,000 records in it. It's quite possible that it recently had a lot more records dumped in it, skewing statistics, but why would WITH(RECOMPILE) and UPDATE STATISTICS not fix that?. It takes a daily snapshot and possibly the number of records increased due to end of month.

So the questions are:

  1. Is actual query plan stored anywhere?, to save me running it interactively?
  2. Normally when a query suddenly takes forever it stale statistics but that doesn't seem to be the case here

This is my version of SQL Server

Microsoft SQL Server 2014 – 12.0.4100.1 (X64) Apr 20 2015 17:29:27 Copyright (c) Microsoft Corporation Standard Edition (64-bit) on Windows NT 6.3 (Build 9600: ) (Hypervisor)

Here are the slow and fast query plans. No suprise they are different because they are doing different things:

Slow Plan:

Slow Query Plan

Fast Plan:

Fast Query Plan

I notice slowpoke uses an loop join and fasty uses a hash match.

I notice the small leg of the loop join has filter

[Expr1006]=DB.thing.table.[SnapshotDate]. Maybe that wasn't so small anymore?

Best Answer

The first query is so slow because it will do a full table scan on thing.table for every row in that table for which DAY(SnapshotDate) <> 1. So if you have 100k rows in the table in the worst case you'll do 100k scans which means reading through 10 billion rows. If the table is small enough it'll stay in memory so your parallel query will appear to burn through CPU.

You can tell by looking at the query plan carefully. The scan is on the inner side of a nested loop join. If that's not your cup of tea you can try live query statistics to see the query as it executes. That way you can get some of the information from the actual plan without needing the query to finish. There's no way to save off old actual plans without setting up extended events.

The second query is faster because the query optimizer is freer to rearrange the elements of the query due to the lack of the CASE expression. Instead of calculating the subquery for MAX(SnapshotDate) once per row the calculation is done once per query.

You'll definitely want to fix this query in some way or the execution time will continue to grow quadratically with the number of rows in the table. One workaround would be to add an index to the SnapshotDate column. The subquery will still execute once for each row but getting the maximum value will be a very cheap operation. A better way is to save off the value of the subquery to a local variable and to use that in your UPDATE query. Unless you have to worry about concurrency that shouldn't be an issue.

You can also stick to the fix that you found if you want. One suggestion that can help in some cases (depending on the table structure) would be to add a where clause to your first UPDATE:

UPDATE thing.table
SET YYYYMM = NULL
WHERE YYYYMM IS NOT NULL;