Sql-server – sparse column or DATA_COMPRESSION

compressionperformancequery-performancesparse-columnsql serversql-server-2016

in one of my databases I have the following table:

CREATE TABLE [app].[applicantSkill](
    [ApplicantSkillID] [int] IDENTITY(1,1) NOT NULL,
    [applicantID] [int] NOT NULL,
    [skillID] [tinyint] NOT NULL,
    [skillDetails] [varchar](500) NULL,
    [skillLevelID] [tinyint] SPARSE  NULL,
    [dateAdded] [datetime2](7) NOT NULL,
    [lastModified] [datetime2](7) NOT NULL,
 CONSTRAINT [PK_tbl_applicant_skill] PRIMARY KEY CLUSTERED 
(
    [ApplicantSkillID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [UserFG],
 CONSTRAINT [uc_appSkillID] UNIQUE NONCLUSTERED 
(
    [applicantID] ASC,
    [skillID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [UserFG]
) ON [UserFG]
GO

which has 1 sparse columnskillLevelID

when I have to re-create my index:

CREATE NONCLUSTERED INDEX I_applicantID  
ON [app].[applicantSkill] (  [applicantID] ASC  , [dateAdded] ASC  )  
INCLUDE ( [ApplicantSkillID] , [skillDetails] , [skillID] , [skillLevelID])  WITH (  PAD_INDEX = OFF, FILLFACTOR = 100  , SORT_IN_TEMPDB = OFF , IGNORE_DUP_KEY = OFF, 
STATISTICS_NORECOMPUTE = OFF, ONLINE = OFF, DROP_EXISTING = OFF, DATA_COMPRESSION=PAGE, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON ) ON [NONCLUSTERED_INDEXES] 

I get the following error message:

Msg 10622, Level 16, State 1, Line 18
The index 'I_applicantID' could not be created or rebuilt. A compressed index is not supported on table that contains sparse columns or a column set column.

How can I find out which one the sparse column or the data_compression should I keep?

Best Answer

In the MS documentation, there is a chart to estimate space savings for sparse columns.

According to said chart, if at least 86% of the values are NULL, you will have an overall savings of at least 40% for this column. Each TINYINT value is going to take up 5 bytes versus the normal 1 byte in this scenario.

You can compare this with the compression savings by using sp_estimate_data_compression_savings for the proposed index.

This should give you enough information to make an informed decision about which is the right approach.

Resources:

  • sp_sizeoptimiser Uses statistics to automatically recommend changing a column to sparse if it meets the threshold to reclaim at least 40% space. Disclaimer: I wrote it!