Sql-server – Unique filtered index and surprising estimation

execution-plansql server

When I view the estimated execution plan for this query…

SELECT *
FROM tabela2  x
     LEFT JOIN dbo.tabela1 t with (nolock) on t.Kod = x.Kod

… the Estimated Number of Rows property for [table1] shows 10. Why is this? In the schema (copied below) I have a unique index AKF_tabela1_Kod. Shouldn't this unique index force the row estimation to be 1?

Estimated Execution Plan

query plan

Create Database, Tables and Index

-- create database 
USE [master]
GO
DROP DATABASE IF EXISTS my_test;
CREATE DATABASE my_test;
GO
ALTER DATABASE my_test SET AUTO_CREATE_STATISTICS OFF;
ALTER AUTHORIZATION ON DATABASE ::my_test TO sa;
USE my_test
GO

-- create dbo.tabela1

CREATE TABLE dbo.[tabela1] 
    (
    [Id] int NOT NULL identity(1,1),
    [kol1] varchar(20) NOT NULL,
    [Kod] int NULL,  
    CONSTRAINT [PK_tabela1] PRIMARY KEY NONCLUSTERED([id] ASC) WITH (ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON),
    CONSTRAINT [BK_tabela1] UNIQUE CLUSTERED([Kol1] ASC) WITH (ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON),
    );
GO

-- create Index 

CREATE UNIQUE NONCLUSTERED INDEX [AKF_tabela1_Kod] 
    ON dbo.[tabela1]([Kod] ASC) 
    INCLUDE ([id],[kol1]) 
    WHERE ([Kod] IS NOT NULL) 
    WITH (ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, MAXDOP = 0);
GO

-- create dbo.tabela2

IF  OBJECT_ID('dbo.tabela2') IS NULL
  CREATE TABLE tabela2 (Kod int NOT NULL)
GO

Insert Data into Table

-- create data in tabela1
declare @i int=1
    while @i<=100
    begin
      INSERT tabela1 (kol1, kod)
      values (@i, cast(@i+100 as varchar))
      set @i += 1;
    end;
GO

UPDATE STATISTICS tabela1 WITH FULLSCAN;

Here the Execution plan

Best Answer

the Estimated Number of Rows property for [table1] shows 10. Why is this? In the schema (copied below) I have a unique index AKF_tabela1_Kod. Shouldn't this unique index force the row estimation to be 1?

The statistics are created when creating the unique index, but as a result of the unique index being filtered, the statistics are also filtered. (where kod is not null).

As a result, the query as it is cannot use these statistics.

Why?

The Index can be used, but the statistics are not a match for the query.

Even if you do a select on kod:

 select kod from dbo.tabela1 t 
     where kod is not null  option(recompile)

The statistics are still no match: enter image description here

If you do a select of the id however:

 select id from dbo.tabela1 t 
 where kod is not null  option(recompile)

There is a match:

enter image description here

This seems to be buggy with auto create stats:

ALTER DATABASE my_test SET AUTO_CREATE_STATISTICS ON;

 select kod from dbo.tabela1 t 
 where kod is not null  option(recompile)

This does not create stats for some reason, but is still able to do the correct estimation. enter image description here

No new stats:

enter image description here

Re running this:

 ALTER DATABASE my_test SET AUTO_CREATE_STATISTICS OFF;

 select kod from dbo.tabela1 t 
 where kod is not null  option(recompile)

Gives the same warning: enter image description here

This seems to be an issue with auto create statistics being off, even though no stats are created.

I am not sure, but an estimate of 10 might be a 'default' value to use, like when using a table variable.

Workarounds:

1

If you create the next index, the result will be 1 estimated rows

CREATE UNIQUE NONCLUSTERED INDEX [AKF_tabela1_Kod2] 
    ON dbo.[tabela1]([Kod] ASC) 
    INCLUDE ([id]) 
    WITH (ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, MAXDOP = 0);
GO

2

Another route you could take is creating your own statistics:

 create statistics ST_Tablea1_Kod on dbo.tabela1(kod)

Retry the query (with a new execution plan):

SELECT *
FROM tabela2  x
     LEFT JOIN dbo.tabela1 t with (nolock) on  t.Kod = x.Kod option(recompile)

enter image description here enter image description here

3

Or ofcourse, enable auto create statistics and rerun the query:

ALTER DATABASE my_test SET AUTO_CREATE_STATISTICS ON;

Update:

The below queries give a stat missing on the first execution, but not on the other two. A statistic was created by auto create stats.

     SELECT *
FROM tabela2  x
     LEFT JOIN dbo.tabela1 t with (nolock) on t.Kod = x.Kod option(recompile)
      ALTER DATABASE my_test SET AUTO_CREATE_STATISTICS ON;
         SELECT *
FROM tabela2  x
     LEFT JOIN dbo.tabela1 t with (nolock) on t.Kod = x.Kod option(recompile)
           ALTER DATABASE my_test SET AUTO_CREATE_STATISTICS OFF;

             SELECT *
FROM tabela2  x
     LEFT JOIN dbo.tabela1 t with (nolock) on t.Kod = x.Kod option(recompile)

enter image description here

Dropping the stats and retrying with a select:

     drop statistics  dbo.tabela1.[_WA_Sys_00000003_0EA330E9]

 select kod from dbo.tabela1 t 
 where kod is not null  option(recompile)

     ALTER DATABASE my_test SET AUTO_CREATE_STATISTICS ON;

 select kod from dbo.tabela1 t 
 where kod is not null  option(recompile)

     ALTER DATABASE my_test SET AUTO_CREATE_STATISTICS OFF;

 select kod from dbo.tabela1 t 
 where kod is not null  option(recompile)

Does not generate stats, and gives the same missing index hints on the first and third execution:

enter image description here

I thought that this maybe was because of the optimization level being trivial. But this was not the case.

StatementOptmLevel="TRIVIAL"

Non trivial optimization level:

 select kod from dbo.tabela1 t 
 where kod is not null 
 group by kod  option(recompile)

     ALTER DATABASE my_test SET AUTO_CREATE_STATISTICS ON;

 select kod from dbo.tabela1 t 
 where kod is not null     group by kod   option(recompile)

     ALTER DATABASE my_test SET AUTO_CREATE_STATISTICS OFF;

 select kod from dbo.tabela1 t 
 where kod is not null    group by kod   option(recompile)

Result: enter image description here

Query #2 with automatic stat creation: enter image description here (I added 100 rows for a total of 200)

Query #3 , after putting automatic stat creation off again: enter image description here (180 estimated rows)

As a conclusion, this should have something to do with automatic stat creation being disabled. My guess is on some sort of temporary stats or the way the plan is created. I checked if i could find any 'temp'stats that are created at runtime, but i did not find anything. (select * from sys.stats where is_temporary = 1)