I've a table
MESSAGES
id (long) | attach_data (image) | ins_date | flag
this table contains all messages of my application. During night, my job cleans old attach_data with query
update messages set attach_data=0x
where ins_date< DATEADD(DAY, -120, GETDATE())
and datalength(image)> 0
this is very long query (about 2 hours)
Other processes are always querying same table with
select id FROM MESSAGES WHERE Flag = 0 ORDER BY ins_date ASC
But during this 2 night hours, this query is not function (timeout expired).
There are a tricks?
I'm running Sql server 2005
Best Answer
[Note: I got to the end of writing this and noticed that sometimes you refer to the data field as
image
and sometimes asattach_data
. You'll need to take that into account as you read this.]You might try breaking up the update statement into multiple batches with TOP:
Then you'll need to modify your maintenance window to execute this query repeatedly until no records are returned. You also might want to use a static date parameter specified by the application since the dates will create a moving target as you re-run this query.
Note that this method will cause the maintenance process to take longer to process overall, but each batch should complete more quickly and therefore you shouldn't be blocking reads as long. You can make the whole process faster by increasing the size of the TOP, but each transaction will lock longer. It's a tradeoff you'll have to play with.
Note also that you can't easily specify an order the records will be updated here since UPDATE doesn't support ORDER BY; you'll just have to keep repeating it until it finishes.
If that doesn't work for you, then there are some optimizations you could make. I would guess that part of the issue is that
DATALENGTH()
is not a very fast function since it has to read the data. Do you really need to test data length here? Is what you're doing significantly different than this:If you must use
DATALENGTH()
or the above doesn't work, its worth noticing that it is deterministic. That means you can index it. It may be worthwhile to create an indexed view, or a computed column in the table which hasDATALENGTH()
that you can then index.Something like this:
Followed by a unique clustered index on
id
, and then a nonclustered index onimage_datalength
,ins_date
. Then you can run:Adding a computed column and an index on that would be similar.
Note that this method does add a number of indexes that now have to be maintained by the DB, so it isn't free.