Sql-server – SQL Server – Varbinary Column – Extremely Slow Statistics Update

sql-server-2008-r2statisticsvarbinary

I have a table in my database that we use as a filestore, the file itself is stored in a varbinary column, which all seemed to work well until recently, we noticed one of our instances of this table had essentially "jammed" on an insert statement.

Checking sys.dm_os_waiting_tasks showed that the insert statement had triggered a statistics update, and that this statistics update was taking a very long time. (17 minutes).

Here's the statement we found running:

SELECT StatMan([SC0], [LC0]) FROM 
   (SELECT TOP 100 PERCENT CONVERT([varbinary](200), 
        SUBSTRING ([Data], 1, 100)++substring([Data], 
        case when LEN([Data])<=200 then 101 else 
        LEN([Data])-99 end, 100)) AS [SC0], 
        datalength([Data]) AS [LC0] 
    FROM [FileSystem].[FileData] WITH 
    (READUNCOMMITTED) ORDER BY [SC0] ) AS _MS_UPDSTATS_TBL

There are roughly 2000 rows in this table, here's what it looks like:

CREATE TABLE [FileSystem].[FileData]
(
[Id] [uniqueidentifier] NOT NULL CONSTRAINT [DF__FileData__Id__09DE7BCC] DEFAULT     (newsequentialid()),
[Data] [varbinary] (max) NULL,
[FileHash] [nvarchar] (4000) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[FileSize] [bigint] NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE [FileSystem].[FileData] ADD CONSTRAINT [PK_FileData] PRIMARY KEY     CLUSTERED  ([Id]) WITH (STATISTICS_NORECOMPUTE=ON) ON [PRIMARY]
GO

We're aware this table is pretty weird, and we are careful to only every address it with a clustered index seek.

For the moment, we've simply disabled automatic statistic generation for this table, but I'm wondering if that's really best practice. Will performance eventually become a problem without up-to-date statistics (bear in mind, this table is only every addressed by it's clustered index)?

Update:
Okay I'm pretty sure we've worked out what was causing the statistic to be generated:

GO CREATE PROCEDURE [FileSystem].[FileData_AppendNewData_Easy] 
    (@fileDataId  uniqueidentifier ) 
WITH EXECUTE AS CALLER AS 
BEGIN 
   declare @testValue varbinary(max); 

   set @testValue = 0xabcedf012439; 

   Update FileSystem.FileData 
   set Data.Write(@testValue, null, null) 
   where Id = @fileDataId ;
END

This is a simplified version of the procedure that was causing the issue, it seems Data.Write causes some kind of implicit predicate on the data column? Well that solves that mystery, although I'm still not sure what impact disabling the statistics for this table might have, can anyone comment on that?

Best Answer

This can be addressed in two additional ways:

  1. Enable Asynchronous Auto Update Statistics and the database level (not my preference). If your production is stuck because due to this issue then this is the quick fix as it is instantaneous but it globally affects the database and can produce sub-par query plans since the plan is generated prior to the stats update.

    ALTER DATABASE [DatabaseName] SET AUTO_UPDATE_STATISTICS_ASYNC ON WITH NO_WAIT
    
  2. Remove the auto generate stat and replace it with a manually created one with no-recompute. The downside is that, while this is a surgical approach, it takes some time to create the statistic and will cause blocking.

    DROP STATISTICS [dbo].[TableName].[_WA_Sys_00000004_7D78A4E7]
    GO
    CREATE STATISTICS [_Manual_Stat_Name] ON [dbo].[TableName]([Column]) WITH NORECOMPUTE;
    GO
    

When we were forced to deal with a similar issue we went with option 1 first and later option 2. During a "green zone" we reverted the database wide setting and then replaced the column's statistic with the manually created NORECOMPUTE one.