Sql-server – Benefits of not having a clustered index on tables (heaps)

clustered-indexheapsql server

What are the benefits of not having a clustered index on a table in SQL Server?

Will:

SELECT * INTO TABLE_A FROM TABLE_B

Be faster if TABLE_A is a heap?

Which operation(s) will benefit if the table is a heap?

I am quite sure UPDATEs and DELETEs will benefit from a clustered index. What about INSERTs? My understanding is that INSERT "might" benefit from the table being a heap, both in term of speed but also other resources and hardware (I/O, CPU, memory and storage…).

What is the most scarce resource in terms of hardware? In terms of storage is a heap going to occupy less space? Is disk storage not the least expensive resource? If so is it rational to keep table as heap in order to save disk space? How will a heap affect CPU and I/O with SELECT, INSERT, UPDATE and DELETE? What cost goes up when table is a heap and we SELECT, UPDATE and DELETE from it?

Best Answer

I'm only addressing the question in the title. The question body has too many different questions to be answered in a stack exchange answer. Heaps have the following advantages over tables with clustered indexes:

  1. Tables with clustered indexes physically organize the data in order of those indexes. The SQL Server query optimizer may add a sort to a query plan for large enough inserts if the data to be inserted isn't already sorted in clustered key order. The additional work performed by the sort is not free and the query can run slower as a result. Inserts into non-partitioned heaps do not need sorted source data.

  2. The rules for minimal logging work differently between heaps and clustered indexes. In some scenarios, an insert into a heap will write fewer bytes to the transaction log than a comparable insert into a table with a clustered index. Reference the Data Loading Performance Guide for some examples.

  3. A clustered index prevents parallel insert into a table. The insert part of the query plan will always be located in a serial zone (DOP = 1). Starting with SQL Server 2016, an insert into a heap with no nonclustered indexes can qualify for parallel insert.

  4. A clustered index will add lock and latch contention when many separate processes are inserting into the same table. Heaps with no nonclustered indexes can be loaded by many processes concurrently with less contention, especially when those inserts are sourced outside of SQL Server with Bulk Update locks.