There is SQL- script, which generated the Nonclustered Index with Included Column:
CREATE TABLE users
(
id INT,
firstname VARCHAR(50),
surname VARCHAR(50)
);
CREATE CLUSTERED INDEX ix_users_id
ON users (id);
CREATE NONCLUSTERED INDEX ix_users_firstname
ON users (firstname)
include (surname);
SELECT firstname,
surname
FROM users
WHERE firstname = 'John';
If I correctly understood, most of the time, Engine of my SQL Server 2019 will seek Nonclustered Index for the above SELECT query, without touching the Clustered Index. Does that mean the value of surname
column is stored in Leaf Node of the Nonclustered Index? Also, that means the value of surname
is duplicated because it also stored in Clustered Index.
Am I right?
Best Answer
Yes, INCLUDE column values are stored at the leaf node of the index in a non-clustered index. In a clustered index, non-key columns are stored at leaf nodes in the same fashion. A heap works differently because there are no key columns.
All non-clustered indexes store duplicate data. For example, the first name column is stored in the clustered index as well as the non-clustered index.
Without a clustered index, the data for all columns is stored in a heap and then data is duplicated in any non-clustered indexes for columns that are key columns or include columns in those indexes.
This is why when designing your indexing strategy, you need to balance performance vs storage/maintenance for the duplicated data.
Check this link for more information about SQL Server Indexes.