Sql-server – Two columns with and without index in where clause – How index works

execution-planindexsql serversql-server-2008

I am executing a query on SQL Server 2008:

SELECT col1
FROM table1
WHERE col2=val2 AND col3=val3

Here col2 has a non-clustered index, col1 is the PRIMARY KEY, and col3 does not have any index. Query execution plan is similar to this.
enter image description here

I want to know how the query execution works here. From the execution plan I can see an index seek on 'col2' and Key-Lookup on 'col3' (shown parallelly).

  • Is it going to fetch all the rows that matches the 'col2=val2' condition and check for the other condition?
  • Why these two -Index seek and Key-Lookup, are shown parallel in the execution plan?
  • Will it always use the available index on 'col2', considering a large dataset and assuming almost all entries in 'col2' are unique?

Best Answer

I want to know how the query execution works here

The general execution model is a pipeline, where each iterator returns a row at a time. Execution starts at the root iterator (on the far left, labelled SELECT in your example).

After initialization, the root iterator requests a row from its immediate child, and so on down the chain until an iterator that can return a row is found. This passes back up the chain to the root where it is queued for despatch to the client. That is a very simplified overview, for more details see:

Is it going to fetch all the rows that matches the 'col2=val2' condition and check for the other condition?

The nonclustered index seek will locate a row that matches col2=val2. It will be able to return col2 and col1 (see its output list) because col1 is present in the index (since the primary key is clustered in this case).

This row is passed up to the nested loops join, which then passes control to the key lookup. The lookup uses the col1 value to seek into the clustered index b-tree to find the value of col3 in this row. The value is tested against the predicate col3=val3 and only returned if it matches.

If there is a match, the row (c1, c2, c3) is passed up the chain and queued for transmission to the client. As control descends the tree again, any new match for col2 in the nonclustered index will result in a repeat of the nested loops join -> lookup -> return row cycle. As soon as the nonclustered index seek runs out of rows, the process completes when control next returns to the root iterator.

Why these two -Index seek and Key-Lookup, are shown parallel in the execution plan?

That's just the way the graphical plan is laid out. See the links and discussion above for the correct way to understand the execution process in detail.

Will it always use the available index on 'col2', considering a large dataset and assuming almost all entries in 'col2' are unique?

Most likely yes. The optimizer makes a cost-based choice between the available strategies. With very few matches expected, the optimizer will usually assess a nonclustered index seek with a lookup as being the cheapest option. An exception occurs when the table is very small. In that case, scanning the clustered index and applying both predicates to each row will likely look cheapest.

As a final note, a covering nonclustered index that would avoid the lookup is:

CREATE [UNIQUE] NONCLUSTERED INDEX [dbo.table1 col2,col3 (col1)]
ON dbo.table1 (col2, col3)
INCLUDE (col1);

It should be specified UNIQUE if that is true for (col2, col3).