Changing the locking to table locks will only make the deletes slower: the delete cannot run until it acquires the lock on the whole table, which means every other session touching the table has to finish or be blocked first. If you have foreign keys with ON DELETE CASCADE enabled, the cascading deletes are probably eating a lot of the time.
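If you are not sure whether cascading deletes are in play, a quick catalog query will tell you (sys.foreign_keys is standard; nothing here is assumed beyond the filter):
SELECT name, delete_referential_action_desc
FROM sys.foreign_keys
WHERE delete_referential_action_desc = N'CASCADE';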
You might want to change it to a SQL Agent job so that, instead of running an app which connects and disconnects, you just run a loop that deletes data until you are done:
SELECT NULL -- Primes @@ROWCOUNT so the WHILE condition is true on the first pass
WHILE @@ROWCOUNT <> 0
BEGIN
    -- Small batches keep the lock footprint and log growth manageable
    DELETE TOP (1000) FROM YourTable
    WHERE Column < getdate() - 31
END
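If even these small batches block other work noticeably, a short pause between batches lets waiting sessions through. A variation of the same loop (the one-second delay is an arbitrary choice):
DECLARE @rc int
SET @rc = 1
WHILE @rc <> 0
BEGIN
    DELETE TOP (1000) FROM YourTable
    WHERE Column < getdate() - 31
    SET @rc = @@ROWCOUNT
    WAITFOR DELAY '00:00:01' -- breathe between batches
END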
If this doesn't work you could look into table partitioning, which lets you switch a partition's data to another table almost instantly and then truncate that table. Note that partitioning requires Enterprise Edition.
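For completeness, the switch-and-truncate pattern looks roughly like this (table names and the partition number are illustrative; the staging table must have an identical structure and live on the same filegroup):
ALTER TABLE YourTable SWITCH PARTITION 1 TO YourTable_Staging;
TRUNCATE TABLE YourTable_Staging;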
Have you considered using snapshot isolation? Enabling read_committed_snapshot on the database will make all ordinary reads (SELECTs) lock-free:
alter database [...] set read_committed_snapshot on;
No application changes are needed. A few semantics do change under row versioning and an application may react in unexpected ways, but that is the exception, not the norm. The vast majority of applications don't notice any difference; they just get a free performance boost.
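You can verify the setting afterwards with a catalog query (substitute your actual database name):
select name, is_read_committed_snapshot_on
from sys.databases
where name = N'YourDatabase';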
Anyway, I thought I'd also answer the original question: how to detect (and possibly kill) a long-running query. Actually, the engine already does that for you. An event is raised when a threshold is crossed: the Blocked Process Report event class. The threshold is configured via the blocked process threshold option. Any trace event can be turned into an event notification, and event notifications can activate procedures. Connect the dots and you have on-demand activated code that runs whenever the engine detects a query blocked for longer than the threshold. No polling, no monitoring. Note though that the notification is asynchronous; by the time you process it, the query may have completed, so that has to be taken into account.
Here is an example:
use msdb;
go
-- Queue that will receive the notification messages
create queue blocked_process_report_queue;
go
-- Service bound to the event notification contract
create service blocked_process_report_service
on queue blocked_process_report_queue
([http://schemas.microsoft.com/SQL/Notifications/PostEventNotification]);
go
-- Server-level notification: post every blocked process report to the service
create event notification blocked_process_report_notification
on server
for BLOCKED_PROCESS_REPORT
to service N'blocked_process_report_service',
N'current database';
go
sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO
-- Raise a blocked process report after 20 seconds of blocking
sp_configure 'blocked process threshold', 20;
GO
RECONFIGURE;
GO
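To confirm the threshold took effect, check value_in_use (it should read 20 here):
select name, value_in_use
from sys.configurations
where name = N'blocked process threshold';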
Now, in a new query window, set up a WAITFOR expecting a notification:
use msdb;
-- Blocks until a blocked process report arrives on the queue
waitfor(
    receive cast(message_body as xml), *
    from blocked_process_report_queue);
And go ahead and cause some blockage. I used a session that created a table and didn't commit, then from another query window I tried to select from the table. After 20 seconds (my configured threshold above) I got the blocking report:
<blocked-process-report>
  <blocked-process>
    <process id="process1b5a24ca8" ...>
      <executionStack>
        <frame line="1" stmtstart="-1"... />
      </executionStack>
      <inputbuf>
select * from t </inputbuf>
    </process>
  </blocked-process>
  <blocking-process>
    <process status="sleeping" spid="51" ...>
      <executionStack />
      <inputbuf>
begin transaction
create table t (a int) </inputbuf>
    </process>
  </blocking-process>
</blocked-process-report>
I'll leave the task of wrapping this up into a fully automated process as an exercise for the reader; a minimal sketch of the activated procedure follows. And yes, the queue/service/activated procedure must be in [msdb].
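This is only a sketch under my own assumptions: the procedure name, the XPath into the report, and the (commented-out) decision to KILL the blocker are illustrative, not prescriptive:
use msdb;
go
create procedure dbo.handle_blocked_process_report
as
begin
    set nocount on;
    declare @mb varbinary(max), @report xml, @spid int, @sql nvarchar(50);
    while 1 = 1
    begin
        set @mb = null;
        -- Pull one notification; give up after a second of idle time
        waitfor (
            receive top (1) @mb = message_body
            from blocked_process_report_queue), timeout 1000;
        if @mb is null
            break;
        set @report = cast(@mb as xml);
        -- The blocking session id lives in blocking-process/process/@spid
        set @spid = @report.value(
            '(//blocking-process/process/@spid)[1]', 'int');
        if @spid is not null
        begin
            -- Here you could log, alert, or (aggressively) kill the blocker.
            -- KILL does not accept a variable, so dynamic SQL is needed:
            set @sql = N'kill ' + cast(@spid as nvarchar(10));
            -- exec (@sql);  -- uncomment only if you really mean it
        end
    end
end
go
-- Attach the procedure to the queue so it runs whenever a report arrives
alter queue blocked_process_report_queue
    with activation (
        status = on,
        procedure_name = dbo.handle_blocked_process_report,
        max_queue_readers = 1,
        execute as owner);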
Best Answer
a) Pretty much anything takes a schema stability lock. You don't want something else changing the structure of the table while you are updating your statistics. According to this, update statistics takes schema stability and modification locks.
b) If something tries to change the table's structure, it will be blocked. IIRC, update stats does dirty reads, so it shouldn't block connections that are merely reading or writing.
c) If you use FULLSCAN, it will read the entire table, because that is what you told it to do. I don't see how that can be seen as anything but 'causing heavy I/O'. Normally the default sampling works well enough, but I have seen it cause problems with data that has a non-homogeneous distribution. Often it's also easier to just reindex the whole table (especially if you can do it online), because reindexing is parallelizable whereas update statistics isn't. (AFAIK, MS did not fix that in SQL 2008.)
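For reference, the two maintenance statements side by side (the table name is illustrative; ONLINE rebuilds also require Enterprise Edition):
UPDATE STATISTICS dbo.YourTable WITH FULLSCAN;
ALTER INDEX ALL ON dbo.YourTable REBUILD WITH (ONLINE = ON);
A rebuild also refreshes that index's statistics with the equivalent of a fullscan as a side effect, which is why it can stand in for the explicit statistics update.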