Sql-server – Performance Implications of encrypted data in a Clustered Columnstore Index

columnstoreencryptionsql serversql-server-2016

I am using SQL Server 2016 and expecting some details on below points regarding performance point of view:

  1. Is there any performance gain or additional benefit on creating a primary key – unique constraint on same table on which a clustered columnstore index is already created?

  2. If my table contains encrypted columns (out of 10 columns, 9 columns are encrypted), is there any benefit in case of creating clustered columnstore index? I am using SYMMETRIC KEY for encryption.

Best Answer

It's best to avoid asking multiple, unrelated questions in a single question when possible. Your first question is very broad. In general, creating an index is always a trade-off and some queries will benefit from that index. The same is true for NCIs on columnstore tables. I'll focus on the second question instead.

As far as I can tell, encryption by SYMMETRIC KEY can be a very poor match for columnstore. Inserting encrypted data into a CCI means that you're inserting long, relatively unique strings into varbinary columns. CCIs don't do well with those data types and data distributions. You're likely to hit dictionary pressure which will limit the size of your rowgroups and you'll miss out on most of the features of columnstore, except for column elimination.

I found the code at How to use SQL Server Encryption with Symmetric Keys helpful in developing a simple example. I don't know what your data looks like, so I'll just insert unique integers into a table and insert the same unique integers with encryption into a different table. Definition of the key:

CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'you will never guess this';

CREATE CERTIFICATE MyFirstCertificate
WITH SUBJECT = 'no u';

CREATE SYMMETRIC KEY key_to_answering_217242 WITH
IDENTITY_VALUE = 'yeah nah',
ALGORITHM = AES_256,
KEY_SOURCE = 'closet full of jandals'
ENCRYPTION BY CERTIFICATE MyFirstCertificate;

T-SQL to create two tables and insert data into both of them:

DROP TABLE IF EXISTS dbo.CCI_NOT_ENCRYPTED;

CREATE TABLE dbo.CCI_NOT_ENCRYPTED (
ID BIGINT,
INDEX CCI CLUSTERED COLUMNSTORE
);

INSERT INTO dbo.CCI_NOT_ENCRYPTED WITH (TABLOCK)
SELECT TOP (1048576) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM master..spt_values t1
CROSS JOIN master..spt_values t2
OPTION (MAXDOP 1);

GO

USE master;
GO

OPEN SYMMETRIC KEY key_to_answering_217242
DECRYPTION BY CERTIFICATE MyFirstCertificate


DROP TABLE IF EXISTS [TEST].dbo.CCI_ENCRYPTED;

CREATE TABLE [TEST].dbo.CCI_ENCRYPTED (
ID VARBINARY(256),
INDEX CCI CLUSTERED COLUMNSTORE
);

INSERT INTO [TEST].dbo.CCI_ENCRYPTED WITH (TABLOCK)
SELECT EncryptByKey(Key_GUID('key_to_answering_217242'), CAST(RN AS VARCHAR(7)))
FROM (
    SELECT TOP (1048576) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) RN
    FROM master..spt_values t1
    CROSS JOIN master..spt_values t2
) q
OPTION (MAXDOP 1);

CLOSE SYMMETRIC KEY key_to_answering_217242;

GO

There's a clear difference in total table size. The table with encrypted data is over 20X bigger:

size difference

Looking at sys.dm_db_column_store_row_group_physical_stats, we can see that the table with encrypted data had its rowgroup size limited by dictionary pressure:

dict pressure

Query performance will of course be greatly impacted as well. The following query takes approximately 0 ms of CPU time:

SELECT COUNT(*)
FROM dbo.CCI_NOT_ENCRYPTED
WHERE ID = 1;

If I run that same query on the encrypted table:

USE master;
GO

OPEN SYMMETRIC KEY key_to_answering_217242
DECRYPTION BY CERTIFICATE MyFirstCertificate

SELECT COUNT(*)
FROM [TEST].dbo.CCI_ENCRYPTED
WHERE CAST(CAST(DecryptByKey(ID) AS VARCHAR(7)) AS BIGINT) = 1;

CLOSE SYMMETRIC KEY key_to_answering_217242;

GO

It takes over 600 ms of CPU time. With all of that said, there appears to be a large overhead in general in working with encrypted data. I cannot say if a columnstore will offer any benefit for your exact data and scenario, only that you definitely won't see the usual benefits of columnstore.