Sql-server – Execution plan – Hash Match

execution-planperformancequery-performancesql-server-2008

SELECT        dbo.Contracts.ContractID AS [Service Unit ID], dbo.Jobs.JobName, dbo.Customers.CustomerName, dbo.Contracts.ContractAutoID, dbo.Contracts.ContractType, dbo.Contracts.ContractStatus AS [Unit Status], 
                         dbo.Contracts.Area
FROM            dbo.Customers WITH (READPAST) INNER JOIN
                         dbo.Jobs WITH (READPAST) ON dbo.Customers.CustomerID = dbo.Jobs.CustomerID INNER JOIN
                         dbo.Contracts WITH (READPAST) ON dbo.Jobs.JobID = dbo.Contracts.JobID
WHERE        (dbo.Contracts.ContractType = 'service')

I understand Hash Match indicates the indexing is not appropriate, which is why it takes up so much space. I am wondering what i need to different with my indexing to get these Hash Matches to become nested loops. Below is a picture of the Execution Plans.

Customer Index

Index Key Column: CustomerID

Included Columns: City, CustomerStatus, CustomerName

Contracts Index

Index Key Column: ContractType

Included Columns: ContractAutoID, JobID, Area, ContractStatus,
ContractID

Jobs Index

Index Key Column: JobID

Included Columns: CustomerID, JobName, Area

Best Answer

You could use INNER LOOP JOIN to cause a nested loop, but it may not improve the performance. Or use an OPTION (LOOP JOIN) hint to still allow the optimiser to decide join order (which is forced by join hints).

Related Solutions

Sql-server – Execution Plan Basics — Hash Match Confusion

As SQLRockstar's answer quotes

best for large, unsorted inputs.

Now,

from the Users.DisplayName index scan (assumed nonclustered) you get Users.Id (assuming clustered) = unsorted
You are also scanning Posts for OwnerUserId = unsorted

This is 2 unordered inputs.

I'd consider an index on the Posts table on OwnerUserId, including Title. This will add some order on one side of the input to the JOIN + it will be covering index

CREATE INDEX IX_OwnerUserId ON Posts (OwnerUserId) INCLUDE (Title)

You may then find that the Users.DisplayName index won't be used and it will scan the PK instead.

Sql-server – Optimising join on large table

Your ix_hugetable looks quite useless because:

it is the clustered index (PK)
the INCLUDE makes no difference because a clustered index INCLUDEs all non-key columns (non-key values at lowest leaf = INCLUDEd = what a clustered index is)

In addition: - added or fk should be first - ID is first = not much use

Try changing the clustered key to (added, fk, id) and drop ix_hugetable. You've already tried (fk, added, id). If nothing else, you'll save a lot of disk space and index maintenance

Another option might be to try the FORCE ORDER hint with table order boh ways and no JOIN/INDEX hints. I try not to use JOIN/INDEX hints personally because you remove options for the optimiser. Many years ago I was told (seminar with a SQL Guru) that FORCE ORDER hint can help when you have huge table JOIN small table: YMMV 7 years later...

Oh, and let us know where the DBA lives so we can arrange for some percussion adjustment

Edit, after 02 Jun update

The 4th column is not part of the non-clustered index so it uses the clustered index.

Try changing the NC index to INCLUDE the value column so it doesn't have to access the value column for the clustered index

create nonclustered index ix_hugetable on dbo.hugetable (
    fk asc, added asc
) include(value)

Note: If value is not nullable then it is the same as COUNT(*) semantically. But for SUM it need the actual value, not existence.

As an example, if you change COUNT(value) to COUNT(DISTINCT value) without changing the index it should break the query again because it has to process value as a value, not as existence.

The query needs 3 columns: added, fk, value. The first 2 are filtered/joined so are key columns. value is just used so can be included. Classic use of a covering index.

Best Answer

Related Solutions

Sql-server – Execution Plan Basics — Hash Match Confusion

Sql-server – Optimising join on large table

Related Question