Sql-server – Value of the Included Column is stored in Leaf Node

database-designindexnonclustered-indexsql server

There is SQL- script, which generated the Nonclustered Index with Included Column:

CREATE TABLE users 
  ( 
     id        INT, 
     firstname VARCHAR(50), 
     surname   VARCHAR(50) 
  ); 

CREATE CLUSTERED INDEX ix_users_id 
  ON users (id); 

CREATE NONCLUSTERED INDEX ix_users_firstname 
  ON users (firstname) 
  include (surname); 

SELECT firstname, 
       surname 
FROM   users 
WHERE  firstname = 'John';

If I correctly understood, most of the time, Engine of my SQL Server 2019 will seek Nonclustered Index for the above SELECT query, without touching the Clustered Index. Does that mean the value of surname column is stored in Leaf Node of the Nonclustered Index? Also, that means the value of surname is duplicated because it also stored in Clustered Index.

Am I right?

Best Answer

Does that mean the value of surname column is stored in Leaf Node of the Nonclustered Index? 

Yes, INCLUDE column values are stored at the leaf node of the index in a non-clustered index. In a clustered index, non-key columns are stored at leaf nodes in the same fashion. A heap works differently because there are no key columns.

Also, that means the value of surname is duplicated because it also stored in Clustered Index.

All non-clustered indexes store duplicate data. For example, the first name column is stored in the clustered index as well as the non-clustered index.

Without a clustered index, the data for all columns is stored in a heap and then data is duplicated in any non-clustered indexes for columns that are key columns or include columns in those indexes.

This is why when designing your indexing strategy, you need to balance performance vs storage/maintenance for the duplicated data.

Check this link for more information about SQL Server Indexes.