Sql-server – Clustered indexed fields hash matching. Does it matter

sql serversql-server-2008

I'm trying to performance tune some lengthy queries and I'm slicing bits out and running them in isolation.

The query below features three tables D500M, D550M and D580M, all joined on person_ref. All of these tables have clustered indexes on the person_ref fields. They also have non clustered indexes which include the fields in the select list.

These tables are not large – just under 10k rows in each.

select p.PERSON_REF,p.surname,p.first_forname,e.employee_number,ph.MAIN_FLAG
from D500M p
INNER join D550M e on e.PERSON_REF=p.PERSON_REF
INNER join D580M ph on ph.PERSON_REF=p.PERSON_REF

I'm studying the Actual Execution Plan and seeing Index scans on the three tables (nonclustered), and then two Hash Matches on the joins, with costs of 39% and 41%.

The question is this; Is having these hash matches good, bad or irrelevant?

Best Answer

SQL Server is very good at deciding the best, most efficient execution plan for any given query. Since you have no predicates, such as a WHERE D500M.PERSON_REF = 123456, SQL Server is most likely assuming an index scan is the quickest way to gather all the relevant details from each table. If you add a predicate to the query, you will most likely see an index seek instead. This is to be expected.

"Good, bad or irrelevant" is only something you can decide. Is the performance acceptable for your needs?

You could try forcing the join type to see what happens if a loop join or a merge join were used instead. Something like:

SELECT p.PERSON_REF, p.surname, p.first_forname, e.employee_number, ph.MAIN_FLAG
FROM D500M p
INNER LOOP JOIN D550M e ON e.PERSON_REF=p.PERSON_REF
INNER LOOP JOIN D580M ph ON ph.PERSON_REF=p.PERSON_REF

If you want a more conclusive answer, you should provide code for a minimally complete verifiable environment that matches your environment. i.e. provide us table definitions, the actual queries involved, and the actual query plans you're seeing.