Sql-server – Reading 300% rows – Problematic execution plan

execution-plan sql-server ssis ssms t-sql

Whole query plan and query: https://www.brentozar.com/pastetheplan/?id=BkgbANxN4

I am struggling with a query that takes a long time to run.
The obvious reason is that the row estimates are badly wrong.
I have tried adding indexes and updating statistics, but I do not seem to be addressing the real problem, since the query is still remarkably slow (I suspect a wrong index or stale statistics).
(Screenshot: the part of the execution plan that is problematic)

What I need help with is interpreting this information so I can optimize the query. Which index should I create, and which DBCC commands or other built-in functions should I run between executions to ensure a clean cache, so the query uses the fresh statistics rather than stale ones?

Examples would be:

DBCC FREEPROCCACHE;        -- clear the plan cache
UPDATE STATISTICS *table*; -- refresh statistics for a table
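A fuller pre-test reset might look like the sketch below. Note that both DBCC commands affect the entire instance, so this is only appropriate on a test server, and the table name is a placeholder:

```sql
-- Run between test executions on a TEST server only:
-- both DBCC commands below affect every session on the instance.
CHECKPOINT;             -- flush dirty pages so the next command gives a truly cold cache
DBCC DROPCLEANBUFFERS;  -- empty the buffer pool (forces cold reads)
DBCC FREEPROCCACHE;     -- discard all cached plans (forces recompilation)

-- Refresh statistics with a full scan instead of the default sample;
-- this often helps when estimates are badly skewed.
UPDATE STATISTICS dbo.YourTable WITH FULLSCAN;  -- table name is a placeholder
```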

(Screenshot: description of the index seek)

Another bottleneck in the same query is these Hash Matches. The rightmost Hash Match finishes within seconds, while the one to the left seems to struggle with its last 213 rows, which take many minutes. What can I do to find out where the problem lies in these Hash Matches?
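One way to see which operator the time is actually going to (assuming SQL Server 2014 or later) is the live query statistics DMV: run the slow query in one session, and from a second session poll sys.dm_exec_query_profiles for that session's spid (55 below is a placeholder):

```sql
-- In the session running the slow query, enable the profiling
-- infrastructure first, e.g. with:
SET STATISTICS XML ON;

-- From a second session, while the query is running:
SELECT node_id,
       physical_operator_name,
       row_count,           -- rows produced so far
       estimate_row_count,  -- the optimizer's estimate
       elapsed_time_ms
FROM sys.dm_exec_query_profiles
WHERE session_id = 55       -- placeholder: the spid of the slow query
ORDER BY node_id;
```

Operators where row_count keeps climbing while elapsed_time_ms accumulates are where the work is; a large gap between row_count and estimate_row_count points at the estimation problem.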

(Screenshot: another part of the execution plan)

I am also trying to solve multiple memory spills in our batch jobs, but I only seem to be able to fix the spill from a single table at a time.

I have multiple Sorts and Hash Matches with rather long probes and residuals that involve several tables or "Expressions", which I assume are either aggregates or values set by SSIS packages.

Should I solve the 'first' or the 'last' spills first? 'First' being at the top of the tree and 'last' being closest to the leaf operators. I also wonder about some of the operators, mentioned below.
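To find every statement that spills, rather than chasing them one table at a time, you could capture the hash and sort warning events with an Extended Events session. This is a sketch; the session name and file name are placeholders:

```sql
-- Capture spill warnings server-wide (session and file names are placeholders).
CREATE EVENT SESSION [spill_watch] ON SERVER
ADD EVENT sqlserver.hash_warning
    (ACTION (sqlserver.sql_text, sqlserver.session_id)),
ADD EVENT sqlserver.sort_warning
    (ACTION (sqlserver.sql_text, sqlserver.session_id))
ADD TARGET package0.event_file (SET filename = N'spill_watch.xel');

ALTER EVENT SESSION [spill_watch] ON SERVER STATE = START;
-- ...run the batch jobs, then read the captured events with
-- sys.fn_xe_file_target_read_file and stop/drop the session.
```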

Can you explain what these terms in the execution plan descriptions mean:

  1. Build residual
  2. Probe residual
  3. Hash keys probe

I believe I understand the following terms:

Order By: the order in which the operator needs its input data

Output List: the columns the operator produces as output

Best Answer

That’s a lot of questions.

I write about Hash Match with Probe Residual here: http://blogs.lobsterpot.com.au/2011/03/22/probe-residual-when-you-have-a-hash-match-a-hidden-cost-in-execution-plans/ - there’s a good chance the problem with your Hash Match finishing is more related to what’s pulling the rows from it than the Hash itself.

As for your stats, the problem is all your ORs. Your Seek may work better with an index that has KVID_Kontotype as the leading key, followed by PeriodeStartDato. Then, instead of doing a large range scan across all dates earlier than your predicate and checking every single row to see whether the Kontotype is correct, it would seek on each Kontotype with the appropriate date range. It would probably estimate better, but more fundamentally it would read fewer rows to get what you need.
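A sketch of the suggested index; the table and index names are placeholders, since only the key columns KVID_Kontotype and PeriodeStartDato appear in the question:

```sql
-- Leading on the equality column, then the range column, lets each
-- Kontotype value get its own tight date-range seek instead of one
-- wide scan with a residual predicate.
CREATE NONCLUSTERED INDEX IX_Kontotype_PeriodeStartDato  -- names are placeholders
ON dbo.YourFactTable (KVID_Kontotype, PeriodeStartDato);
-- Optionally add an INCLUDE (...) list covering the other columns the
-- query outputs, to avoid key lookups.
```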