SQL Server – Clustered columnstore index performance in SQL Server 2014

columnstore | olap | sql-server

I'm setting up an OLAP database using SQL Server 2014. The core fact table has about 40,000,000 rows and 225 columns, with an average row size of 181 bytes. I've been toying around with the clustered columnstore index a bit without much luck. In general, I find query performance more than 4 times slower with the new technology.

One particular example: selecting a single row by its int32 primary key now takes 12 seconds … this is a sub-second operation on a rowstore table (which, of course, had a unique index on the PK; such an index is not allowed alongside a clustered columnstore index).

I'm trying to figure out what I'm doing wrong – from the MS docs it sounds like this is ideal technology for this task; maybe I'm missing something.

I'm running SQL 2014 Enterprise on Windows 8.1 64-bit with 128GB RAM and SSD for data storage. The data is read only for this app.

Best Answer

Posting the specific data and queries you are using is probably the only way we can answer the question for your specific case. You can use a script that generates anonymized data at roughly the same scale as your real example.

However, I went ahead and created a similar script myself. For the sake of simplicity, I am using fewer than 225 columns, but the same number of rows and random data (which is unfavorable for columnstore), and I saw results very different from yours. So my initial thought is that yes, you do have some sort of problem with either your configuration or your test queries.

A few of the key takeaways:

  • Columnstore has dramatically faster performance than rowstore for simple aggregations across all rows in a column
  • If loaded carefully, columnstore can perform surprisingly well for singleton seeks. There is an I/O hit, but with a warm cache performance was very good, though of course not as good as rowstore for this use case.
  • If you need to perform both singleton seeks and large aggregation queries, you might consider layering a non-clustered columnstore index on top of a standard b-tree table (see the sketch just after this list).
  • You mention that you have 225 columns, but an average row is just 181 bytes. This seems a little unusual; is your table mostly BIT columns? That might be something to look into further. I did see very good compression ratios on a simple BIT column columnstore (over 99%), but much of that may be due to the absence of per-row overhead, an advantage that would disappear with many BIT columns on a single row.
  • If you want to learn (a lot) more about columnstore, Niko's 66-part (and counting) blog series has been the most valuable reference that I've come across.
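
To illustrate that hybrid option, here's a minimal sketch (my addition, not part of the tests below) of layering a non-clustered columnstore index on the rowstore table created later in this answer. Keep in mind that in SQL Server 2014 a non-clustered columnstore index makes the table read-only while it exists, which happens to suit your read-only workload.

-- Sketch only; the index name ncs_test_row is hypothetical.
-- PK seeks continue to use the b-tree, while large scans and
-- aggregations can use the columnstore and run in batch mode.
-- Note: in SQL Server 2014 this makes dbo.test_row read-only.
CREATE NONCLUSTERED COLUMNSTORE INDEX ncs_test_row
ON dbo.test_row (id, col1, col2, col3, col4, col5)
GO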

And now on to the details:

Create rowstore data set

Nothing too exciting here; we create 40MM rows of pseudo-random data.

SELECT @@VERSION
--Microsoft SQL Server 2014 - 12.0.4213.0 (X64) 
--  Jun  9 2015 12:06:16 
--  Copyright (c) Microsoft Corporation
--  Developer Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1)
GO

-- Create a rowstore table with 40MM rows of pseudorandom data
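-- E1 has 10 rows; each CROSS JOIN squares the row count, so E8 yields 10^8
-- candidate rows, plenty for TOP 40000000. The ISNULL(..., 0) wrappers make
-- SELECT INTO produce NOT NULL columns, which the PRIMARY KEY on id requires.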
;WITH E1(N) AS (
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 
    UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
)
, E2(N) AS (SELECT 1 FROM E1 a CROSS JOIN E1 b)
, E4(N) AS (SELECT 1 FROM E2 a CROSS JOIN E2 b)
, E8(N) AS (SELECT 1 FROM E4 a CROSS JOIN E4 b)
SELECT TOP 40000000 ISNULL(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), 0) AS id
    , ISNULL((ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) % 5) + 1, 0) AS col1
    , ISNULL(ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) * RAND(), 0) AS col2
    , ISNULL(ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) * RAND(), 0) AS col3
    , ISNULL(ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) * RAND(), 0) AS col4
    , ISNULL(ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) * RAND(), 0) AS col5
INTO dbo.test_row
FROM E8
GO
ALTER TABLE dbo.test_row
ADD CONSTRAINT PK_test_row PRIMARY KEY (id)
GO

Create columnstore data set

Let's create the same data set as a CLUSTERED COLUMNSTORE, using the loading techniques described on Niko's blog to get better segment elimination.

-- Create a columnstore table with the same 40MM rows
-- The data is first ordered by id, and then a single thread
-- is used to build the columnstore for optimal segment elimination
SELECT *
INTO dbo.test_column
FROM dbo.test_row
GO
CREATE CLUSTERED INDEX cs_test_column
ON dbo.test_column (id)
GO
CREATE CLUSTERED COLUMNSTORE INDEX cs_test_column 
ON dbo.test_column WITH (DROP_EXISTING = ON, MAXDOP = 1)
GO
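
As a sanity check (my addition; the original loading steps end above), you can inspect the columnstore segment metadata to confirm that the id column's segments have non-overlapping min/max ranges, which is what enables segment elimination for predicates like id = 12345678.

-- Sketch: per-segment min/max metadata for the id column. This assumes the
-- index column order matches the table column order, so that
-- sys.column_store_segments.column_id lines up with sys.columns.column_id.
SELECT s.segment_id, s.min_data_id, s.max_data_id, s.row_count
FROM sys.column_store_segments s
JOIN sys.partitions p
    ON p.partition_id = s.partition_id
JOIN sys.columns c
    ON c.object_id = p.object_id
    AND c.column_id = s.column_id
WHERE p.object_id = OBJECT_ID('dbo.test_column')
    AND c.name = 'id'
ORDER BY s.segment_id
GO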

Size comparison

Because we are loading random data, columnstore achieves only a modest reduction in table size. If the data were less random, columnstore compression would dramatically decrease the size of the index. This particular test case is actually quite unfavorable for columnstore, but it's still nice to see that we get a little bit of compression.

-- Check the sizes of the two tables
SELECT t.name, ps.row_count, (ps.reserved_page_count*8.0) / (1024.0) AS sizeMb
FROM sys.tables t WITH (NOLOCK)
JOIN sys.dm_db_partition_stats ps WITH (NOLOCK)
    ON ps.object_id = t.object_id
WHERE t.name IN ('test_row','test_column')
--name          row_count   sizeMb
--test_row      40000000    2060.6328125
--test_column   40000000    1352.2734375
GO

Performance comparison

The following two tests cover very different use cases.

The first is the singleton seek mentioned in your question. As commenters point out, this is not at all the use case columnstore is designed for. Because an entire segment has to be read for each column, we see far more reads and slower performance from a cold cache (0 ms rowstore vs. 273 ms columnstore). However, columnstore drops to 2 ms with a warm cache; that's actually quite an impressive result given that there is no b-tree to seek into!

In the second test, we compute an aggregate for two columns across all rows. This is more along the lines of what columnstore is designed for, and we can see that columnstore has fewer reads (due to compression and not needing to access all columns) and dramatically faster performance (primarily due to batch mode execution). From a cold cache, columnstore executes in 4s vs 15s for rowstore. With a warm cache, the difference is a full order of magnitude at 282ms vs 2.8s.

SET STATISTICS TIME, IO ON
GO

-- Clear cache; don't do this in production!
-- I ran this statement between each set of trials to get a fresh read
--CHECKPOINT
--DBCC DROPCLEANBUFFERS
GO

-- Trial 1: CPU time = 0 ms,  elapsed time = 0 ms.
    -- logical reads 4, physical reads 4, read-ahead reads 0
-- Trial 2: CPU time = 0 ms,  elapsed time = 0 ms
    -- logical reads 4, physical reads 0, read-ahead reads 0
SELECT *
FROM dbo.test_row
WHERE id = 12345678
GO 2
-- Trial 1: CPU time = 15 ms,  elapsed time = 273 ms.
    -- lob logical reads 9101, lob physical reads 1, lob read-ahead reads 25756
-- Trial 2: CPU time = 0 ms,  elapsed time = 2 ms.  
    -- lob logical reads 9101, lob physical reads 0, lob read-ahead reads 0
SELECT *
FROM dbo.test_column
WHERE id = 12345678
GO 2

-- Trial 1: CPU time = 8441 ms,  elapsed time = 14985 ms.
    -- logical reads 264733, physical reads 3, read-ahead reads 263720
-- Trial 2: CPU time = 9733 ms,  elapsed time = 2776 ms.
    -- logical reads 264883, physical reads 0, read-ahead reads 0
SELECT AVG(id), SUM(col3)
FROM dbo.test_row
GO 2
-- Trial 1: CPU time = 1233 ms,  elapsed time = 3992 ms.
    -- lob logical reads 207778, lob physical reads 1, lob read-ahead reads 341196
-- Trial 2: CPU time = 1030 ms,  elapsed time = 282 ms. 
    -- lob logical reads 207778, lob physical reads 0, lob read-ahead reads 0
SELECT AVG(id), SUM(col3)
FROM dbo.test_column
GO 2
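
If you want to confirm that batch mode execution is driving the aggregation speedup, one option (my addition, not part of the trials above) is to capture the actual plan and check the execution mode reported on the columnstore operators.

-- Sketch: emit the actual execution plan as XML alongside the results.
-- In the plan, the Columnstore Index Scan and Hash Match (Aggregate)
-- operators should report an actual execution mode of Batch.
SET STATISTICS XML ON
GO
SELECT AVG(id), SUM(col3)
FROM dbo.test_column
GO
SET STATISTICS XML OFF
GO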