I have a table in my database that we use as a filestore, the file itself is stored in a varbinary
column, which all seemed to work well until recently, we noticed one of our instances of this table had essentially "jammed" on an insert statement.
Checking sys.dm_os_waiting_tasks
showed that the insert statement had triggered a statistics update, and that this statistics update was taking a very long time. (17 minutes).
Here's the statement we found running:
SELECT StatMan([SC0], [LC0]) FROM
(SELECT TOP 100 PERCENT CONVERT([varbinary](200),
SUBSTRING ([Data], 1, 100)++substring([Data],
case when LEN([Data])<=200 then 101 else
LEN([Data])-99 end, 100)) AS [SC0],
datalength([Data]) AS [LC0]
FROM [FileSystem].[FileData] WITH
(READUNCOMMITTED) ORDER BY [SC0] ) AS _MS_UPDSTATS_TBL
There are roughly 2000 rows in this table, here's what it looks like:
CREATE TABLE [FileSystem].[FileData]
(
[Id] [uniqueidentifier] NOT NULL CONSTRAINT [DF__FileData__Id__09DE7BCC] DEFAULT (newsequentialid()),
[Data] [varbinary] (max) NULL,
[FileHash] [nvarchar] (4000) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[FileSize] [bigint] NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE [FileSystem].[FileData] ADD CONSTRAINT [PK_FileData] PRIMARY KEY CLUSTERED ([Id]) WITH (STATISTICS_NORECOMPUTE=ON) ON [PRIMARY]
GO
We're aware this table is pretty weird, and we are careful to only every address it with a clustered index seek.
For the moment, we've simply disabled automatic statistic generation for this table, but I'm wondering if that's really best practice. Will performance eventually become a problem without up-to-date statistics (bear in mind, this table is only every addressed by it's clustered index)?
Update:
Okay I'm pretty sure we've worked out what was causing the statistic to be generated:
GO CREATE PROCEDURE [FileSystem].[FileData_AppendNewData_Easy]
(@fileDataId uniqueidentifier )
WITH EXECUTE AS CALLER AS
BEGIN
declare @testValue varbinary(max);
set @testValue = 0xabcedf012439;
Update FileSystem.FileData
set Data.Write(@testValue, null, null)
where Id = @fileDataId ;
END
This is a simplified version of the procedure that was causing the issue, it seems Data.Write
causes some kind of implicit predicate on the data column? Well that solves that mystery, although I'm still not sure what impact disabling the statistics for this table might have, can anyone comment on that?
Best Answer
This can be addressed in two additional ways:
Enable Asynchronous Auto Update Statistics and the database level (not my preference). If your production is stuck because due to this issue then this is the quick fix as it is instantaneous but it globally affects the database and can produce sub-par query plans since the plan is generated prior to the stats update.
Remove the auto generate stat and replace it with a manually created one with no-recompute. The downside is that, while this is a surgical approach, it takes some time to create the statistic and will cause blocking.
When we were forced to deal with a similar issue we went with option 1 first and later option 2. During a "green zone" we reverted the database wide setting and then replaced the column's statistic with the manually created NORECOMPUTE one.