I have a SQL query that I have spent the past two days trying to optimise using trial-and-error and the execution plan, but to no avail. Please forgive me for doing this but I will post the entire execution plan here. I have made the effort to make the table and column names in the query and execution plan generic both for brevity and to protect my company's IP. The execution plan can be opened with SQL Sentry Plan Explorer.
I have done a fair amount of T-SQL, but using execution plans to optimise my query is a new area for me and I have really tried to understand how to do it. So, if anyone could help me with this and explain how this execution plan can be deciphered to find ways in the query to optimise it, I would be eternally grateful. I have many more queries to optimise – I just need a springboard to help me with this first one.
This is the query:
DECLARE @Param0 DATETIME = '2013-07-29';
DECLARE @Param1 INT = CONVERT(INT, CONVERT(VARCHAR, @Param0, 112))
DECLARE @Param2 VARCHAR(50) = 'ABC';
DECLARE @Param3 VARCHAR(100) = 'DEF';
DECLARE @Param4 VARCHAR(50) = 'XYZ';
DECLARE @Param5 VARCHAR(100) = NULL;
DECLARE @Param6 VARCHAR(50) = 'Text3';
SET NOCOUNT ON
DECLARE @MyTableVar TABLE
(
B_Var1_PK int,
Job_Var1 varchar(512),
Job_Var2 varchar(50)
)
INSERT INTO @MyTableVar (B_Var1_PK, Job_Var1, Job_Var2)
SELECT B_Var1_PK, Job_Var1, Job_Var2 FROM [fn_GetJobs] (@Param1, @Param2, @Param3, @Param4, @Param6);
CREATE TABLE #TempTable
(
TTVar1_PK INT PRIMARY KEY,
TTVar2_LK VARCHAR(100),
TTVar3_LK VARCHAR(50),
TTVar4_LK INT,
TTVar5 VARCHAR(20)
);
INSERT INTO #TempTable
SELECT DISTINCT
T.T1_PK,
T.T1_Var1_LK,
T.T1_Var2_LK,
MAX(T.T1_Var3_LK),
T.T1_Var4_LK
FROM
MyTable1 T
INNER JOIN feeds.MyTable2 A ON A.T2_Var1 = T.T1_Var4_LK
INNER JOIN @MyTableVar B ON B.Job_Var2 = A.T2_Var2 AND B.Job_Var1 = A.T2_Var3
GROUP BY T.T1_PK, T.T1_Var1_LK, T.T1_Var2_LK, T.T1_Var4_LK
-- This is the slow statement...
SELECT
CASE E.E_Var1_LK
WHEN 'Text1' THEN T.TTVar2_LK + '_' + F.F_Var1
WHEN 'Text2' THEN T.TTVar2_LK + '_' + F.F_Var2
WHEN 'Text3' THEN T.TTVar2_LK
END,
T.TTVar4_LK,
T.TTVar3_LK,
CASE E.E_Var1_LK
WHEN 'Text1' THEN F.F_Var1
WHEN 'Text2' THEN F.F_Var2
WHEN 'Text3' THEN T.TTVar5
END,
A.A_Var3_FK_LK,
C.C_Var1_PK,
SUM(CONVERT(DECIMAL(18,4), A.A_Var1) + CONVERT(DECIMAL(18,4), A.A_Var2))
FROM #TempTable T
INNER JOIN TableA (NOLOCK) A ON A.A_Var4_FK_LK = T.TTVar1_PK
INNER JOIN @MyTableVar B ON B.B_Var1_PK = A.Job
INNER JOIN TableC (NOLOCK) C ON C.C_Var2_PK = A.A_Var5_FK_LK
INNER JOIN TableD (NOLOCK) D ON D.D_Var1_PK = A.A_Var6_FK_LK
INNER JOIN TableE (NOLOCK) E ON E.E_Var1_PK = A.A_Var7_FK_LK
LEFT OUTER JOIN feeds.TableF (NOLOCK) F ON F.F_Var1 = T.TTVar5
WHERE A.A_Var8_FK_LK = @Param1
GROUP BY
CASE E.E_Var1_LK
WHEN 'Text1' THEN T.TTVar2_LK + '_' + F.F_Var1
WHEN 'Text2' THEN T.TTVar2_LK + '_' + F.F_Var2
WHEN 'Text3' THEN T.TTVar2_LK
END,
T.TTVar4_LK,
T.TTVar3_LK,
CASE E.E_Var1_LK
WHEN 'Text1' THEN F.F_Var1
WHEN 'Text2' THEN F.F_Var2
WHEN 'Text3' THEN T.TTVar5
END,
A.A_Var3_FK_LK,
C.C_Var1_PK
IF OBJECT_ID(N'tempdb..#TempTable') IS NOT NULL
BEGIN
DROP TABLE #TempTable
END
IF OBJECT_ID(N'tempdb..#TempTable') IS NOT NULL
BEGIN
DROP TABLE #TempTable
END
What I have found is that the third statement (commented as being slow) is the part that is taking the most time. The two statements before return almost instantly.
The execution plan is available as XML at this link.
Better to right-click and save and then open in SQL Sentry Plan Explorer or some other viewing software rather than opening in your browser.
If you need any more information from me about the tables or data, please don't hesitate to ask.
Best Answer
Before getting to the main answer, there are two pieces of software you need to update.
Required Software Updates
The first is SQL Server. You are running SQL Server 2008 Service Pack 1 (build 2531). You ought to be patched up to at least the current Service Pack (SQL Server 2008 Service Pack 3 - build 5500). The most recent build of SQL Server 2008 at the time of writing is Service Pack 3, Cumulative Update 12 (build 5844).
The second piece of software is SQL Sentry Plan Explorer. The latest versions have significant new features and fixes, including the ability to directly upload a query plan for expert analysis (no need to paste XML anywhere!)
Query Plan Analysis
The cardinality estimate for the table variable is exactly right, thanks to a statement-level recompilation:
Unfortunately, table variables do not maintain distribution statistics, so all the optimizer knows is that there are six rows; it knows nothing of the values that might be in those six rows. This information is crucial given that the next operation is a join to another table. The cardinality estimate from that join is based on a wild guess by the optimizer:
From that point on, the plan chosen by the optimizer is based on incorrect information, so it's no wonder really that performance is so poor. In particular, the memory set aside for sorts and hash tables for hash joins will be far too small. At execution time, the overflowing sorts and hashing operations will be spilled out to physical tempdb disk.
SQL Server 2008 does not highlight this in execution plans; you can monitor the spills using Extended Events or Profiler Sort Warnings and Hash Warnings. Memory is reserved for sorts and hashes based on cardinality estimates before execution starts, and cannot be increased during execution regardless of how much spare memory your SQL Server may have. Accurate row count estimates are therefore crucial for any execution plan that involves workspace memory consuming operations.
Your query is also parameterized. You should consider adding
OPTION (RECOMPILE)
to the query if different parameter values affect the query plan. You should probably consider using it anyway, so the optimizer can see the value of@Param1
at compilation time. If nothing else, this may help the optimizer produce a more reasonable estimate for the index seek shown above, given that the table is very large, and partitioned. It may also enable static partition elimination.Try the query again with a temporary table instead of the table variable and
OPTION (RECOMPILE)
. You should also try materializing the result of the first join into another temporary table, and run the rest of the query against that. The number of rows is not all that large (3,285,620) so this should be reasonably quick. The optimizer will then have an exact cardinality estimate and distribution statistics for the result of the join. With luck, the rest of the plan will fall into place nicely.Working from the properties shown in the plan, the materializing query would be:
You could also
INSERT
into a predefined temporary table (the correct data types are not shown in the plan, so I cannot do that part). The new temporary table may or may not benefit from clustered and nonclustered indexes.