Sql-server – Table statistic vs estimated rows query plan

execution-plansql serverstatistics

Can anyone explain why despite having inaccurate table statistics, the actual query plan "estimated number of rows" is correct? E.g table has 30 rows but all table statistics show 50 rows. However a query on the table produces a plan that apparently correctly estimates 30 rows. How did it know there are actually 30 rows?

CREATE TABLE dbo.TestPerf
  (
     ID   INT IDENTITY(1, 1),
     Col1 VARCHAR(100)
  )
GO

INSERT TestPerf
SELECT 'badger'

GO 1000

CREATE STATISTICS stat_TestPerf_Col1 ON dbo.TestPerf (Col1) WITH FULLSCAN;

--lose some rows 
DELETE dbo.TestPerf
WHERE  ID > 750 --Check the stat not updated - still shows 1000 rows 
DBCC SHOW_STATISTICS (N'dbo.TestPerf', stat_TestPerf_Col1) --now run a query and see the estimatedRows 
SET statistics profile ON

SELECT *
FROM   dbo.TestPerf

SET statistics profile OFF --estimatedRows = 750

Best Answer

You are checking the statistics for changes in the wrong place in your code. They won't be auto updated immediately after the delete. This will happen when a query is executed or compiled that needs them (and in fact your code contains no query that needs them anyway apart from the delete itself).

But SQL Server does maintain cardinality information for the partitions of the table/indexes as a whole after every insert/delete etc.

You can see this with the below (only returns 1 row as the table is not partitioned and has no indexes)

SELECT row_count
FROM   sys.dm_db_partition_stats
WHERE  object_id = object_id('dbo.TestPerf'); 

This shows it that the current row count is 750. The query

SELECT *
FROM   dbo.TestPerf

Doesn't benefit from column statistics. It can just use that directly and even if statistics are disabled will be accurate.

A query with WHERE ID > 0 would use the statistics and return the whole table. As it happens the statistics don't get updated in your case anyway when such a query is executed as you haven't hit the threshold for modifications for an auto update to occur. But the estimates are still accurate.

The statistics records that there were 1,000 rows at sampling time.

enter image description here

So it would be simple to just scale the estimates down to adjust for the new table cardinality by multiplying by 0.75.

The scaling down works fine for WHERE Id>0 as 1,000 * 0.75 gives the correct result but if you try querying the 250 rows that you just deleted this will not show correct estimates.

enter image description here

The statistics indicate that it will return 250 rows, multiplying that by 0.75 gives 187.5.