Sql-server – How to reproduce “Could not continue scan with NOLOCK due to data movement”

nolocksql serversql-server-2008-r2

I occasionally get "Could not continue scan with NOLOCK due to data movement" with some large jobs, which do have WITH (NOLOCK) on the select queries.

I understand this has something to do with attempting to select data when there has been a page split which caused the data to no longer be where it was supposed to be – I assume that's what's happening in my environment.

How would I reproduce this?

I'm trying to do a short-term workaround to catch the error and retry when this happens, but I can't test it if I can't reproduce it. Is there a reasonably reliable way to cause this?

When it does happen, executing the query again results in success – so I don't really have any concerns about the actual data or database being permanently corrupt. Some of the tables in the query (along with their indexes) get dropped, recreated, and repopulated often, so I'm assuming it's something related to that.

Removing NOLOCK is my long-term issue to deal with. The reason NOLOCK was put there in the first place was that the queries are so bad that they were deadlocking with day-to-day transactions, so NOLOCK was a band-aid to stop the deadlocks (which worked). So I need a band-aid on a band-aid until we can do a permanent solution.

If I could reproduce it with a Hello World, I would plan on probably slapping the band-aid into the job in less than an hour. Can't do a search-and-replace removing NOLOCK, because I'd start getting the app deadlocks again, which are worse for me than an occasional failing job.

Using read committed snapshot isolation is a good possibility – I'll have to work with our database team to get more details about that. Part of our problem is that we don't have a SQL Server expert to deal with that kind of thing, and I don't understand the isolation levels well enough to make that change right now.

Best Answer

Since one potential 'band aid' to the NOLOCK issues is to stop using NOLOCK and begin using READ_COMMITTED_SNAPSHOT isolation, I want to point you to the blog post at http://www.brentozar.com by Kendra Little: Implementing Snapshot or Read Committed Snapshot Isolation in SQL Server: A Guide.

Kendra provides a fair amount of detail on benefits and risks with using the READ_COMMITTED_SNAPSHOT isolation level.

  1. This isolation level becomes the default isolation level for the database code.
  2. You have to have only one user in the database in order to make the change to the READ_COMMITTED_SNAPSHOT isolation level.
  3. Even if you use READ_COMMITTED_SNAPSHOT isolation you will still need to remove the NOLOCK hints since they override the default.
  4. Some of your code may well have problems that need curing.

Some years ago we implemented READ_COMMITTED_SNAPSHOT isolation on a database that was severely suffering from blocking. But once we changed the isolation level we began getting deadlocks in a couple of critical areas.

Why did this happen? Because the previous isolation level caused heavy blocking, the code could "never" reach the point of deadlocking. However, with READ_COMMITTED_SNAPSHOT isolation, the queries could keep moving forward. However, some percentage of the no-longer waiting transactions began deadlocking.

Fortunately our case was resolved quickly by determining the deadlock points and adjusting the indexes on a couple of tables to have a more rational column order. This greatly reduced our locking problems.