Sql-server – What does SQL Server use to do a count(*) on a table

heapindexsql serversql server 2014sql-server-2016

Internally, how does SQL Server determine the result for a select(*)? Is it different between a heap and a clustered index? Is it able to do smart things with the clustered index to not have to load the whole table?

Best Answer

With a trivial SELECT COUNT(*) FROM dbo.YourTable; query on a disk-based table, SQL server counts rows by scanning either the heap or leaf nodes of an index. The cost-based SQL Server optimizer chooses the narrowest one available in order to minimize the number of pages scanned.

This example shows the behavior with an index scan.

CREATE TABLE dbo.YourTable(
      Column1 int NOT NULL
    );
--load 10K rows
WITH 
     t10 AS (SELECT n FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) t(n))
    ,t1k AS (SELECT 0 AS n FROM t10 AS a CROSS JOIN t10 AS b CROSS JOIN t10 AS c)
    ,t1g AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS num FROM t1k AS a CROSS JOIN t1k AS b CROSS JOIN t1k AS c)
INSERT INTO dbo.YourTable WITH (TABLOCKX) (Column1) 
SELECT num
FROM t1g
WHERE num <= 10000;
GO
CREATE INDEX ncidx_YourTable_IntColumn ON dbo.YourTable(Column1);
GO
--index scan
SELECT COUNT(*) FROM dbo.YourTable;
GO
--show heap and index size
SELECT
    i.name              AS IndexName,
    SUM(page_count * 8) AS IndexSizeKB
FROM sys.dm_db_index_physical_stats(
    db_id(), object_id('dbo.YourTable'), NULL, NULL, 'DETAILED') AS s
JOIN sys.indexes AS i
ON s.object_id = i.object_id AND s.index_id = i.index_id
GROUP BY i.name
ORDER BY i.name;

Index/heap sizes:

+---------------------------+-------------+
|         IndexName         | IndexSizeKB |
+---------------------------+-------------+
| NULL                      |         272 |
| ncidx_YourTable_IntColumn |         192 |
+---------------------------+-------------+

If we fragment the index, it becomes larger than the heap so the heap is scanned instead of the index:

--fragment index and update stats
UPDATE dbo.YourTable SET Column1 = 2;
UPDATE STATISTICS dbo.YourTable;
DBCC FREEPROCCACHE;
--heap scan
SELECT COUNT(*) FROM dbo.YourTable;
GO

Heap and index size with fragmentation:

+---------------------------+-------------+
|         IndexName         | IndexSizeKB |
+---------------------------+-------------+
| NULL                      |         272 |
| ncidx_YourTable_IntColumn |         648 |
+---------------------------+-------------+

After the index is rebuilt, the index is scanned again because it is narrower:

--rebuild index
ALTER INDEX ncidx_YourTable_IntColumn ON dbo.YourTable REBUILD;
DBCC FREEPROCCACHE;
--index scan
SELECT COUNT(*) FROM dbo.YourTable;
GO

Heap and index size after rebuild:

SELECT
    i.name              AS IndexName,
    SUM(page_count * 8) AS IndexSizeKB
FROM sys.dm_db_index_physical_stats(
    db_id(), object_id('dbo.YourTable'), NULL, NULL, 'DETAILED') AS s
JOIN sys.indexes AS i
ON s.object_id = i.object_id AND s.index_id = i.index_id
GROUP BY i.name
ORDER BY i.name;
GO

+---------------------------+-------------+
|         IndexName         | IndexSizeKB |
+---------------------------+-------------+
| NULL                      |         272 |
| ncidx_YourTable_IntColumn |         192 |
+---------------------------+-------------+