I have a table with a column of type NVarChar(Max).
I didn't take good precautions before allowing users to add data so right now I have more than 2.5 billion rows and the database is becoming super large.
I'm trying to alter the column to change the size to a fixed length where records > that width should be deleted.
This is not an indexed column.
I tried LEN(Text)>width but this is a very slow function as it would scan billions of rows.
We've also experimented later with changing Int to BigInt by creating a new table and moving data there but it took a week or so.
Alter Column would freeze the system.
Can you kindly suggest any other way or suggest the method you feel is best?
The database is hosted on Azure SQL
Thank you so much 🙂
Best Answer
The best method is always specific to your requirements and situation. Some approaches might be faster than others but require more temporary space or disallow end users from querying the table. Based on what you've said in your question I'm going to assume the following:
One approach to fix it:
Verify that your desired maximum length will free up enough space. If a simple
COUNT_BIG(*)
is too slow you can useTABLESAMPLE
and a bit of math to figure out how many rows you'll delete. I created a 100 GB table on my local machine and running a 1% sample query took less than a second with a cold cache. Example code below:Notice the use of
DATALENGTH
instead ofLEN
. See Datalength optimizations for LOB data types… for the reasons you might find that performs better.Delete rows from the table in small batches. Use the clustered key to loop over all rows exactly once. If you'd like more detail here, try Take Care When Scripting Batches. Otherwise, the core idea is to increment local variables as you process the table:
Once all of the deletes are done, stop and measure your progress. Is the table small enough now? You may need to run a
REORGANIZE
orREBUILD
to fully reclaim space. I recommendREORGANIZE
because it's not an all-or-nothing operation (if you cancel it you keep your progress), it won't grow your log file, and it's an online operation.Stop here. If the problem is that the table takes up too much space then you've solved that the problem by removing unneeded rows from the table. You probably do not need to change the data type in order to reclaim space. Having an unnecessary
NVARCHAR(MAX)
isn't ideal, but the downsides may not justify the effort in changing the data type. If you do need to change the data type then it will be easier to do so after you've freed up some space in your database.