A clustered columnstore index is fundamentally different from a clustered rowstore index. You may have noticed there is no key column specification for a clustered columnstore index. That's right: a clustered columnstore index is an index with no keys - all columns are 'included'.
The most intuitive description I have heard for a clustered columnstore index is to think of it as a column-oriented heap table (where the 'RID' is rowgroup_id, row_number
).
If you need indexes to support direct ordering and/or point/small range selections, you can create updateable rowstore b-tree indexes on top of clustered columnstore in SQL Server 2016.
In many cases this is simply not necessary, since columnstore access and batch mode sorting is so fast. Many of the things people 'know' about rowstore performance need to be relearned for columnstore. Scans and hashes are good :)
That said, of course columnstore has a structure to its row groups (and metadata about min/max values in each segment), which can be useful in queries that can benefit from row group/segment elimination.
One important technique in this area is to first create a clustered rowstore index with the desired ordering, then create the clustered columnstore index using the WITH (DROP_EXISTING = ON, MAXDOP = 1)
option. In your example:
CREATE [UNIQUE] CLUSTERED INDEX idx
ON dbo.tab1_cstore (id, time)
WITH (MAXDOP = 1);
CREATE CLUSTERED COLUMNSTORE INDEX idx
ON dbo.tab1_cstore
WITH (DROP_EXISTING = ON, MAXDOP = 1);
Care is needed to maintain the benefits of row group/segment elimination over time. Also, while columnstore is already implicitly partitioned by row group, but you can explicitly partition it as well.
I'm not 100% sure what you're looking to test, but it is true that the 'order' of values within a segment is determined by the compression algorithm. My point about creating the columnstore index with DROP_EXISTING
is about the ordering of data flowing into the segment creation process, so that segments overall will be ordered in a particular way. Within the segment, all bets are off.
The "bookmark" is the columnstore index original locator (per "Pro SQL Server Internals" by Dmitri Korotkevitch). This is an 8-byte value, with the columnstore index's row_group_id
in the first 4-bytes and an offset in the second 4-bytes.
If you use DBCC PAGE
to look at the non-clustered index, the 8-byte columnstore index original locator appears in the "uniquifier" column of the DBCC PAGE
output. This shows that a unique non-clustered index does not need to include the columnstore row locator, whereas a non-unique non-clustered index does.
The following code creates a columnstore-organized table with a unique and non-unique b-tree nonclustered index on the same column:
CREATE TABLE dbo.Heapish
(
c1 bigint NOT NULL,
c2 bigint NOT NULL,
INDEX CCI_dbo_Heapish CLUSTERED COLUMNSTORE
);
GO
INSERT dbo.Heapish WITH (TABLOCKX)
(c1, c2)
SELECT TOP (1024 * 1024 * 8)
c1 = ROW_NUMBER() OVER
(ORDER BY C1.[object_id], C1.column_id),
c2 = ROW_NUMBER() OVER
(ORDER BY C1.[object_id], C1.column_id)
FROM master.sys.columns AS C1
CROSS JOIN master.sys.columns AS C2
ORDER BY
c1
OPTION (MAXDOP 1);
GO
CREATE UNIQUE NONCLUSTERED INDEX UNIQUE_c2 ON dbo.Heapish (c2) WITH (MAXDOP = 1);
CREATE NONCLUSTERED INDEX NONUNIQUE_c2 ON dbo.Heapish (c2) WITH (MAXDOP = 1);
We can see the size of the index row at different levels of the b-tree using sys.dm_db_index_physical_stats
:
SELECT
DDIPS.index_level,
DDIPS.page_count,
DDIPS.record_count,
DDIPS.min_record_size_in_bytes,
DDIPS.max_record_size_in_bytes
FROM sys.dm_db_index_physical_stats
(
DB_ID(),
OBJECT_ID(N'dbo.Heapish', N'U'),
INDEXPROPERTY(OBJECT_ID(N'dbo.Heapish', N'U'), N'UNIQUE_c2', 'IndexID'),
NULL, 'DETAILED'
) AS DDIPS;
SELECT
DDIPS.index_level,
DDIPS.page_count,
DDIPS.record_count,
DDIPS.min_record_size_in_bytes,
DDIPS.max_record_size_in_bytes
FROM sys.dm_db_index_physical_stats
(
DB_ID(),
OBJECT_ID(N'dbo.Heapish', N'U'),
INDEXPROPERTY(OBJECT_ID(N'dbo.Heapish', N'U'), N'NONUNIQUE_c2', 'IndexID'),
NULL, 'DETAILED'
) AS DDIPS;
The output is:
Both structures have the same row size at the leaf level, but the nonunique nonclustered index is 12 bytes larger than the unique nonclustered index at the non-leaf levels due to the 8-byte columnstore locator, plus 4 bytes of overhead for the first variable-length column in a row (uniquifier is variable length).
Best Answer
You can create additional nonclustered indexes on tables with clustered columnstore indexes, the following script shows this...
I have attempted the same thing using SSMS 16.5.3. It is possible that there is a bug in the version you are using (latest is at https://docs.microsoft.com/en-us/sql/ssms/download-sql-server-management-studio-ssms ).
Consider creating the script using TSQL rather than the GUI to see if you get the same result.