Sql-server – Reading 300% rows – Problematic execution plan

execution-plan sql-server ssis ssms t-sql

Whole query plan and query: https://www.brentozar.com/pastetheplan/?id=BkgbANxN4

I am struggling with a query that takes a long time to run.
The obvious reason is that the row estimates are badly wrong.
I have tried adding indexes and updating statistics, but I do not seem to be addressing the real problem, since the query is still remarkably slow (I suspect a wrong index or stale statistics).
(Screenshot: the part of the execution plan that is problematic)

What I need help with is interpreting this information so I can optimize the query. Which index should I create, and which DBCC commands or other built-in functions should I run between executions to ensure a clean cache, so the query uses the fresh statistics rather than stale ones?

Examples would be:

DBCC FREEPROCCACHE;        -- clear the plan cache
UPDATE STATISTICS *table*; -- refresh statistics for a table
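A fuller pre-test reset might look like the sketch below. Note that both DBCC commands affect the entire instance, so this is only appropriate on a test server, and the table name is a placeholder:

```sql
-- Run between test executions on a TEST server only:
-- both DBCC commands below affect every session on the instance.
CHECKPOINT;             -- flush dirty pages so the next command gives a truly cold cache
DBCC DROPCLEANBUFFERS;  -- empty the buffer pool (forces cold reads)
DBCC FREEPROCCACHE;     -- discard all cached plans (forces recompilation)

-- Refresh statistics with a full scan instead of the default sample;
-- this often helps when estimates are badly skewed.
UPDATE STATISTICS dbo.YourTable WITH FULLSCAN;  -- table name is a placeholder
```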

(Screenshot: description of the index seek)

Another bottleneck in the same query is these Hash Matches. The rightmost Hash Match finishes within seconds, while the one to the left seems to struggle with its last 213 rows, which take many minutes. What can I do to find out where the problem lies in these Hash Matches?
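One way to see which operator the time is actually going to (assuming SQL Server 2014 or later) is the live query statistics DMV: run the slow query in one session, and from a second session poll sys.dm_exec_query_profiles for that session's spid (55 below is a placeholder):

```sql
-- In the session running the slow query, enable the profiling
-- infrastructure first, e.g. with:
SET STATISTICS XML ON;

-- From a second session, while the query is running:
SELECT node_id,
       physical_operator_name,
       row_count,           -- rows produced so far
       estimate_row_count,  -- the optimizer's estimate
       elapsed_time_ms
FROM sys.dm_exec_query_profiles
WHERE session_id = 55       -- placeholder: the spid of the slow query
ORDER BY node_id;
```

Operators where row_count keeps climbing while elapsed_time_ms accumulates are where the work is; a large gap between row_count and estimate_row_count points at the estimation problem.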

(Screenshot: another part of the execution plan)

I am also trying to solve multiple memory spills in our batch jobs, but I only seem to be able to fix the spill from a single table at a time.

I have multiple Sorts and Hash Matches with rather long probes and residuals that involve several tables or "Expressions", which I assume are either aggregates or values set by SSIS packages.

Should I solve the 'first' or the 'last' spills first? 'First' being at the top of the tree and 'last' being closest to the leaf operators. I also wonder about some of the operators, mentioned below.
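To find every statement that spills, rather than chasing them one table at a time, you could capture the hash and sort warning events with an Extended Events session. This is a sketch; the session name and file name are placeholders:

```sql
-- Capture spill warnings server-wide (session and file names are placeholders).
CREATE EVENT SESSION [spill_watch] ON SERVER
ADD EVENT sqlserver.hash_warning
    (ACTION (sqlserver.sql_text, sqlserver.session_id)),
ADD EVENT sqlserver.sort_warning
    (ACTION (sqlserver.sql_text, sqlserver.session_id))
ADD TARGET package0.event_file (SET filename = N'spill_watch.xel');

ALTER EVENT SESSION [spill_watch] ON SERVER STATE = START;
-- ...run the batch jobs, then read the captured events with
-- sys.fn_xe_file_target_read_file and stop/drop the session.
```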

Can you explain what these terms in the execution plan descriptions mean:

  1. Build residual
  2. Probe residual
  3. Hash keys probe

I believe I understand the following terms:

Order By: the order in which the operator needs its input data

Output List: the columns the operator produces as output

Best Answer

That’s a lot of questions.

I write about Hash Match with Probe Residual here: http://blogs.lobsterpot.com.au/2011/03/22/probe-residual-when-you-have-a-hash-match-a-hidden-cost-in-execution-plans/ - there’s a good chance the problem with your Hash Match finishing is more related to what’s pulling the rows from it than the Hash itself.

As for your stats, the problem is all your ORs. Your Seek may work better with an index that has KVID_Kontotype as the leading key, followed by PeriodeStartDato. Then, instead of doing a large range scan across all dates earlier than your predicate and checking every single row to see whether the Kontotype is correct, it would seek on each Kontotype with the appropriate date range. It would probably estimate better, but more fundamentally it would read fewer rows to get what you need.
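A sketch of the suggested index; the table and index names are placeholders, since only the key columns KVID_Kontotype and PeriodeStartDato appear in the question:

```sql
-- Leading on the equality column, then the range column, lets each
-- Kontotype value get its own tight date-range seek instead of one
-- wide scan with a residual predicate.
CREATE NONCLUSTERED INDEX IX_Kontotype_PeriodeStartDato  -- names are placeholders
ON dbo.YourFactTable (KVID_Kontotype, PeriodeStartDato);
-- Optionally add an INCLUDE (...) list covering the other columns the
-- query outputs, to avoid key lookups.
```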