Sql-server – Empty strings in VARCHAR(MAX) consuming space which cannot be recovered

disk-spacesql-server-2008

I have a table with a non-nullable VARCHAR(MAX) column where every row contains an empty string (third party database so I can't change it). I see a huge amount of space use, over 15GB for 1 million rows.

If I run the following query I see a huge number of used_pages for the LOB_DATA even though there is no data in that column. Excluding the VARCHAR(MAX) the row size is under 8000 bytes.

 SELECT o.name AS table_name,p.index_id, au.type, au.type_desc, au.total_pages, au.used_pages, au.data_pages
 FROM sys.allocation_units AS au
   JOIN sys.partitions AS p ON au.container_id = p.partition_id
   JOIN sys.objects AS o ON p.object_id = o.object_id
 WHERE o.name = 'table
 ORDER BY o.name, p.index_id;

type    type_desc             total_pages   used_pages  data_pages
1   IN_ROW_DATA 23258             23252           23188
2   LOB_DATA              1880733             1880455             0
3   ROW_OVERFLOW_DATA   0             0           0

Updates from comments:

  • The table is an archive table so the data should never have changed (only inserted)
  • I have an identical table in a seperate database (same server) with similar data that doesn't exhibit this problem
  • Rebuilding the index didn't help
  • large_value_types_out_of_row is 0
  • DBCC CLEANTABLE didn't help

Best Answer

Is it possible the column was populated and it is has since been cleared out, or the data type was recently changed to varchar(max)? Have you tried rebuilding the clustered index (which will touch every row, unlike some ALTERs)?

EDIT

Since rebuilding the clustered index didn't help, I'm at a loss, and since the size of the existing data is so small, I recommend just creating a new version of the table and moving the data over. You can do this by explicitly re-creating the table and then copying the data, or you can use SELECT INTO to minimize the steps.

SELECT ... other cols ...,
    varchar_max_col = CONVERT(VARCHAR(MAX), NULL) 
    INTO dbo.newtable FROM dbo.oldtable;
CREATE CLUSTERED INDEX ... ON dbo.newtable( ...cols... );
CREATE INDEX ... ON dbo.newtable( ...cols... );
...
GO
BEGIN TRANSACTION;
EXEC sp_rename 'dbo.oldtable', 'oldtable_backup', OBJECT;
EXEC sp_rename 'dbo.newtable', 'oldtable', OBJECT;
COMMIT TRANSACTION;