I have a table with a non-nullable VARCHAR(MAX) column where every row contains an empty string (it's a third-party database, so I can't change the schema). I'm seeing a huge amount of space usage: over 15 GB for 1 million rows.
If I run the following query, I see a huge number of used_pages for the LOB_DATA allocation unit even though there is no data in that column. Excluding the VARCHAR(MAX) column, the row size is under 8000 bytes.
SELECT o.name AS table_name,
       p.index_id,
       au.type,
       au.type_desc,
       au.total_pages,
       au.used_pages,
       au.data_pages
FROM sys.allocation_units AS au
JOIN sys.partitions AS p ON au.container_id = p.partition_id
JOIN sys.objects AS o ON p.object_id = o.object_id
WHERE o.name = 'table'
ORDER BY o.name, p.index_id;
type  type_desc          total_pages  used_pages  data_pages
----  -----------------  -----------  ----------  ----------
1     IN_ROW_DATA              23258       23252       23188
2     LOB_DATA               1880733     1880455           0
3     ROW_OVERFLOW_DATA             0           0           0
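As a sanity check on where the 15 GB comes from: SQL Server pages are 8 KB each, so the LOB_DATA used_pages figure accounts for nearly all of the reported space:

```sql
-- 8 KB per page; 1,880,455 used LOB pages
SELECT 1880455 * 8 / 1024.0 / 1024.0 AS lob_gb;  -- roughly 14.3 GB
```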
Updates from comments:
- The table is an archive table so the data should never have changed (only inserted)
- I have an identical table in a separate database (same server) with similar data that doesn't exhibit this problem
- Rebuilding the index didn't help
- large_value_types_out_of_row is 0
- DBCC CLEANTABLE didn't help
Best Answer
Is it possible the column was populated and has since been cleared out, or that the data type was recently changed to varchar(max)? Have you tried rebuilding the clustered index (which touches every row, unlike some ALTER operations)?
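A minimal sketch of that rebuild, assuming placeholder names `dbo.ArchiveTable` for the table and `PK_ArchiveTable` for the clustered index:

```sql
-- Rebuild the clustered index to rewrite every row,
-- which can reclaim orphaned LOB allocations.
ALTER INDEX PK_ArchiveTable ON dbo.ArchiveTable REBUILD;

-- Or rebuild all indexes on the table in one statement:
ALTER INDEX ALL ON dbo.ArchiveTable REBUILD;
```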
EDIT
Since rebuilding the clustered index didn't help, I'm at a loss. Given how small the actual data is, I recommend just creating a new version of the table and moving the data over. You can do this by explicitly re-creating the table and then copying the data, or you can use SELECT ... INTO to minimize the steps.
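A sketch of the SELECT ... INTO approach (table names are placeholders; note that SELECT ... INTO copies only the data and column definitions, not indexes or constraints, so those must be re-created afterwards):

```sql
-- Copy all rows into a fresh table; the LOB space is allocated cleanly.
SELECT *
INTO dbo.ArchiveTable_new
FROM dbo.ArchiveTable;

-- Then, during a maintenance window:
-- 1. Re-create indexes and constraints on dbo.ArchiveTable_new
-- 2. DROP TABLE dbo.ArchiveTable;
-- 3. EXEC sp_rename 'dbo.ArchiveTable_new', 'ArchiveTable';
```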