Sql-server – Creating additional indexes on a table with a columnstore clustered index

columnstorenonclustered-indexsql serversql-server-2016

According to this article, a table with a clustered columnstore index can have one or more additional non-clustered indexes. However, when I try to add a non-clustered index on a table with a CCI, I get the error shown in the image below:

Am I overlooking something, or misinterpreting something? I'm using SQL Server 2016 with SP1.

Best Answer

You can create additional nonclustered indexes on tables with clustered columnstore indexes, the following script shows this...

CREATE TABLE dbo.T2 (C1 INT NOT NULL, C2 INT, C3 INT);
GO
CREATE CLUSTERED COLUMNSTORE INDEX CCI ON dbo.T2;
GO
CREATE NONCLUSTERED INDEX IX ON dbo.T2 (C2);
GO

I have attempted the same thing using SSMS 16.5.3. It is possible that there is a bug in the version you are using (latest is at https://docs.microsoft.com/en-us/sql/ssms/download-sql-server-management-studio-ssms ).

Consider creating the script using TSQL rather than the GUI to see if you get the same result.

Related Solutions

Sql-server – Code creating clustered columnstore index while maintaining row order

A clustered columnstore index is fundamentally different from a clustered rowstore index. You may have noticed there is no key column specification for a clustered columnstore index. That's right: a clustered columnstore index is an index with no keys - all columns are 'included'.

The most intuitive description I have heard for a clustered columnstore index is to think of it as a column-oriented heap table (where the 'RID' is rowgroup_id, row_number).

If you need indexes to support direct ordering and/or point/small range selections, you can create updateable rowstore b-tree indexes on top of clustered columnstore in SQL Server 2016.

In many cases this is simply not necessary, since columnstore access and batch mode sorting is so fast. Many of the things people 'know' about rowstore performance need to be relearned for columnstore. Scans and hashes are good :)

That said, of course columnstore has a structure to its row groups (and metadata about min/max values in each segment), which can be useful in queries that can benefit from row group/segment elimination.

One important technique in this area is to first create a clustered rowstore index with the desired ordering, then create the clustered columnstore index using the WITH (DROP_EXISTING = ON, MAXDOP = 1) option. In your example:

CREATE [UNIQUE] CLUSTERED INDEX idx 
ON dbo.tab1_cstore (id, time)
WITH (MAXDOP = 1);

CREATE CLUSTERED COLUMNSTORE INDEX idx 
ON dbo.tab1_cstore
WITH (DROP_EXISTING = ON, MAXDOP = 1);

Care is needed to maintain the benefits of row group/segment elimination over time. Also, while columnstore is already implicitly partitioned by row group, but you can explicitly partition it as well.

I'm not 100% sure what you're looking to test, but it is true that the 'order' of values within a segment is determined by the compression algorithm. My point about creating the columnstore index with DROP_EXISTING is about the ordering of data flowing into the segment creation process, so that segments overall will be ordered in a particular way. Within the segment, all bets are off.

Sql-server – Nonclustered index storage on clustered columnstore

The "bookmark" is the columnstore index original locator (per "Pro SQL Server Internals" by Dmitri Korotkevitch). This is an 8-byte value, with the columnstore index's row_group_id in the first 4-bytes and an offset in the second 4-bytes.
If you use DBCC PAGE to look at the non-clustered index, the 8-byte columnstore index original locator appears in the "uniquifier" column of the DBCC PAGE output. This shows that a unique non-clustered index does not need to include the columnstore row locator, whereas a non-unique non-clustered index does.

The following code creates a columnstore-organized table with a unique and non-unique b-tree nonclustered index on the same column:

CREATE TABLE dbo.Heapish
(
    c1 bigint NOT NULL,
    c2 bigint NOT NULL,
    INDEX CCI_dbo_Heapish CLUSTERED COLUMNSTORE
);
GO
INSERT dbo.Heapish WITH (TABLOCKX)
    (c1, c2)
SELECT TOP (1024 * 1024 * 8)
    c1 = ROW_NUMBER() OVER
        (ORDER BY C1.[object_id], C1.column_id),
    c2 = ROW_NUMBER() OVER
        (ORDER BY C1.[object_id], C1.column_id)
FROM master.sys.columns AS C1
CROSS JOIN master.sys.columns AS C2
ORDER BY
    c1
OPTION (MAXDOP 1);
GO
CREATE UNIQUE NONCLUSTERED INDEX UNIQUE_c2 ON dbo.Heapish (c2) WITH (MAXDOP = 1);
CREATE NONCLUSTERED INDEX NONUNIQUE_c2 ON dbo.Heapish (c2) WITH (MAXDOP = 1);

We can see the size of the index row at different levels of the b-tree using sys.dm_db_index_physical_stats:

SELECT
    DDIPS.index_level,
    DDIPS.page_count,
    DDIPS.record_count,
    DDIPS.min_record_size_in_bytes,
    DDIPS.max_record_size_in_bytes
FROM sys.dm_db_index_physical_stats
(
    DB_ID(),
    OBJECT_ID(N'dbo.Heapish', N'U'),
    INDEXPROPERTY(OBJECT_ID(N'dbo.Heapish', N'U'), N'UNIQUE_c2', 'IndexID'),
    NULL, 'DETAILED'
) AS DDIPS;

SELECT
    DDIPS.index_level,
    DDIPS.page_count,
    DDIPS.record_count,
    DDIPS.min_record_size_in_bytes,
    DDIPS.max_record_size_in_bytes
FROM sys.dm_db_index_physical_stats
(
    DB_ID(),
    OBJECT_ID(N'dbo.Heapish', N'U'),
    INDEXPROPERTY(OBJECT_ID(N'dbo.Heapish', N'U'), N'NONUNIQUE_c2', 'IndexID'),
    NULL, 'DETAILED'
) AS DDIPS;

The output is:

Both structures have the same row size at the leaf level, but the nonunique nonclustered index is 12 bytes larger than the unique nonclustered index at the non-leaf levels due to the 8-byte columnstore locator, plus 4 bytes of overhead for the first variable-length column in a row (uniquifier is variable length).

Best Answer

Related Solutions

Sql-server – Code creating clustered columnstore index while maintaining row order

Sql-server – Nonclustered index storage on clustered columnstore

Related Question