SQL Server – Why Compressing an Existing Table Results in No Change to Table Size

compressionsql server

There is an existing table in SQL 2012 Enterprise DB, which is ~120MB in size with ~2000 records. Its largest field is nvarchar(max) containing spongy XML (ranging from 20 to 300 kB) which on its own compresses roughly by 50-75% in 7Zip or RAR.

I tried the following commands which return instantaneously, even though simply querying * from the table takes quite a few seconds:

alter table sch.table1 REBUILD PARTITION=ALL WITH (DATA_COMPRESSION=ROW)
alter table sch.table1 REBUILD PARTITION=ALL WITH (DATA_COMPRESSION=PAGE)
alter table sch.table1 REBUILD PARTITION=ALL WITH (DATA_COMPRESSION=NONE)

But the size is not changing after any of them, even though the properties of the table correctly show the compression type.

Is the server actually applying and removing compression to the rows/pages? It seemingly does not have time to accomplish that in the split second that it takes each command to execute.

Best Answer

Row compression and page compression have a number of limitations and won't always reduce the space used by your table. Let's walk through a simple example. Suppose I insert a bunch of strings of length 4030 into a table. Applying page compression results in a space reduction from 4000 KB to 24 KB:

CREATE TABLE dbo.ZZZZZ_CI (
ID INT NOT NULL IDENTITY (1, 1),
FLUFF VARCHAR(4050),
PRIMARY KEY (ID)
);

SET NOCOUNT ON;

GO
-- insert 1000 rows
INSERT INTO dbo.ZZZZZ_CI VALUES (REPLICATE('Z', 4030));
GO 1000

EXEC sp_spaceused 'dbo.ZZZZZ_CI'; -- 4000 KB for data

ALTER TABLE dbo.ZZZZZ_CI REBUILD WITH (DATA_COMPRESSION=PAGE);

EXEC sp_spaceused 'dbo.ZZZZZ_CI'; -- now 24 KB for data

However, if I insert strings of length 4050 I get no space savings from page compression, even though the strings are all the same character!

-- remove data and compression
TRUNCATE TABLE dbo.ZZZZZ_CI;
ALTER TABLE dbo.ZZZZZ_CI REBUILD WITH (DATA_COMPRESSION=NONE);

GO
-- insert 1000 rows
INSERT INTO dbo.ZZZZZ_CI VALUES (REPLICATE('Z', 4050));
GO 1000

EXEC sp_spaceused 'dbo.ZZZZZ_CI'; -- 8000 KB for data

ALTER TABLE dbo.ZZZZZ_CI REBUILD WITH (DATA_COMPRESSION=PAGE);

EXEC sp_spaceused 'dbo.ZZZZZ_CI'; -- still 8000 KB for data

For your table you're probably running into a different limitation. A table in SQL Server can only contain 8060 bytes per row. Any data that cannot fit is stored separately as ROW_OVERFLOW_DATA. Your average row size is around 64 KB and your minimum row size is 20 KB. That means that all of your XML data cannot be stored in row. Data compression does not work on data stored out of the row:

SQL Server compresses in-row LOB data but not if the LOB data is stored out of row. With the result many LOB heavy applications are unable to take full advantage of data compression. However, there are two workaround available. First, LOB data can be compressed at the application tier but this means (a) the application needs to be modified (b) the application cannot take advantage of search and partial update capabilities provided in SQL Engine. Second, use filestream feature to store LOB data on a compressed volume. This is our recommended solution where applicable. Please refer to Books-Online for details on filestream feature.

Both of those workarounds are available in SQL Server 2012. In SQL Server 2016 there is an additional option with the introduction of GZIP. That can compress data that otherwise would have been stored out of the row. The data from before could not be page compressed, but GZIP reduces space needed to 64 KB:

CREATE TABLE dbo.ZZZZZ_CI_2016 (
ID INT NOT NULL IDENTITY (1, 1),
FLUFF_COMPRESSED varbinary(max),
PRIMARY KEY (ID)
);

GO
-- insert 1000 compressed rows
INSERT INTO dbo.ZZZZZ_CI_2016 VALUES (COMPRESS(REPLICATE('Z', 4050)));
GO 1000

EXEC sp_spaceused 'dbo.ZZZZZ_CI_2016';  -- now 64 KB for data

SELECT COUNT(*) -- 1000 is returned
FROM ZZZZZ_CI_2016
WHERE DECOMPRESS(FLUFF_COMPRESSED) = REPLICATE('Z', 4050);

Even very long strings can be compressed nicely:

TRUNCATE TABLE dbo.ZZZZZ_CI_2016;

GO
-- insert 1000 compressed rows
INSERT INTO dbo.ZZZZZ_CI_2016 VALUES (COMPRESS(REPLICATE(CAST('Z' AS VARCHAR(MAX)), 99999)));
GO 1000

EXEC sp_spaceused 'dbo.ZZZZZ_CI_2016';  -- now 152 KB for for 99999 length strings