Sql-server – Force Sql server to use index seek in merge join

execution-planoptimizationsql serversql-server-2012

I am testing Merge join in sql server. I have an INNER JOIN and force the optimizer to do MERGE JOIN:

ID in Personal Table is a Primary key
ID in Abteilung Table is a primary key

There is no other Indexes in these tables

 select * from  [dbo].[Personal] as P inner join [dbo].[Abteilung] as A  
 on  P.[ID]= A.[Personal_ID]   OPTION (MERGE JOIN)

then Optimizer uses this plan to run my query:

I would like to force optimizer to use index seek instead of Table Scan

What I did to achieve this aim:

Defining a grouped index in the Table Abteilung on columns ID, Personal_ID. If I run the query, the execution plan is changes like this:

here I have Clustered index scan, but not clustered index seek

Using FORCESEEK hint. If I am using this hint, an error occures by running the query:

Query processor could not produce a query plan because of the hints defined in this query. Resubmit the query without specifying any
hints and without using SET FORCEPLAN.

If I add a where clause in the query, then the query would be change to:

  select * from  [dbo].[Personal] as P inner join [dbo].[Abteilung] as A  
  on P.[ID]= A.[Personal_ID] where A.[Personal_ID]=2 
  OPTION (MERGE    JOIN)

in this situation I have only Clustered index scan.

My question:

How do I change my query to have Clustered Index Seek in the bottom line?

Best Answer

To get a clustered index seek, you'd need a clustered index that supports your filter (e.g. leading key would have to be Personal_ID, not ID).

You can't force a seek if there's no index with a leading column of Personal_ID that supports the filter.

This does not mean you should change the existing clustered index, unless this is the only query you ever run against the table.

And while you could create a non-clustered index with Personal_ID as the key column, a seek on that index might not be what you want anyway - since you're using SELECT * (are you really sure you need all columns from both tables?), it will need to go fetch the rest of the columns from the clustered index anyway, and if there are more than _{some number of rows} returned, at some point a seek (well, what would amount to a range scan disguised as a seek) + lookups will be more expensive than a regular scan.

Why do you think you need a seek here? How many rows does the query return, how wide are they, and how long does it take? Is this just educational, or are you under the assumption that a seek will always perform better than a scan? (It won't, btw.)

Some useful reading:

Best Answer

Related Solutions

Sql-server – Filtered index not chosen, and rejected with hint

Sql-server – Why does referencing a variable in a join predicate force nested loops

Related Question