Sql-server – Is the default clustered index file physically the same file as the file containing the data records

clustered-indexsql server

I have read that the records in a sorted data file (for a DB table) are sorted in the same order as the keys in its corresponding clustered index — and a leaf node of clustered index contains the actual data record corresponding to the key in the leaf node.

Does that mean that they are physically the same on-disk file ?

If yes , then is the clustered index's tree data structure actually physically part of the data file ?

If no, and they are physically separate on-disk files, then does the leaf node contain a pointer to the data record ? — but that's what happens in a non-clustered index.
I can't find an explicit answer to this query. Kindly help.

Best Answer

You refer to "data file" as like the data (actual rows) for a table is in a file separate from other tables and indexes. That is not the case. I just like to point that out, since that might be a contributing reason for the confusion.

There is so much info available out there on physical structures for SQL server, so this will be super-brief:

Data (loosely speaking) is stored in pages (8KB) blocks, in a database data file. There are different type of pages.

The clustered index is the data. I.e., the leaf level of that index are the actual rows. It contains all columns, and the sort for that index is of course based on the index key for that clustered index. There's nowhere else/further to "point". When you navigated the tree, and are now in the leaf for the cl ix, you are indeed at the data row.

Also note that a data file (.mdf, .ndf) for a database (in contrast to a transaction log file, .ldf) can contain all types of pages. That include pages for non-clustered index, pages for clustered indexes, pages for heaps (data pages for tables that don't have a clustered index), etc.

So, at the file level, there is no separation between "data" and "indexes". You can achieve that, to some degree, using filegroups, but that is a separate topic.

Also note that I carefully didn't confuse the clustered index with the primary key. It just happens that the default index type for the PK is a clustered index. But that is just a default.