Sql-server – Clustered columnstore indexes and foreign keys

columnstoredata-warehouseforeign keysql serversql server 2014

I am performance tuning a data warehouse using indexes. I am fairly new to SQL Server 2014.Microsoft describes the following:

"We view the clustered columnstore index as the standard for storing large data warehousing fact tables, and expect it will be used in most data warehousing scenarios. Since the clustered columnstore index is updateable, your workload can perform a large number of insert, update, and delete operations." http://msdn.microsoft.com/en-us/library/gg492088.aspx

However if you read further in the documentation you will find under limitations and restrictions:

"Cannot have unique constraints, primary key constraints, or foreign key constraints."

This confuses me a lot! It is a good practice (not mandatory) to have foreign keys in the data warehouse for a variety of reasons (data integrity, relations visible for semantic layer …)

So Microsoft advocates clustered columnstore indexes for data warehouse scenarios; however, it can not handle foreign key relationships?!

Am I correct on this? Which other approaches would you advise? In the past, I have used a nonclustered columnstore index in data warehouse scenarios, with drop and rebuild for data loads. However SQL Server 2014 then adds no real new value for data warehouses??

Best Answer

You've got lots of questions in here:

Q: (The lack of foreign keys) confuses me a lot! It is a good practice (not mandatory) to have Fk's in the DWH for a variety of reasons (data integrity, relations visible for semantic layer, ....)

A: Correct, it's normally a good practice to have foreign keys in a data warehouse. However, clustered columnstore indexes don't support that yet.

Q: So MS advocates Clustered Column store indexes for DWH scenarios, However it can not handle FK relationships?!

A: Microsoft gives you tools. It's up to you how you use those tools.

If your biggest challenge is a lack of data integrity in your data warehouse, then the tool you want is conventional tables with foreign keys.

If your biggest challenge is query performance, and you're willing to check your own data integrity as part of the loading process, then the tool you want is clustered columnstore indexes.

Q: However SQL 2014 than adds no real new value for DWH??

A: Thankfully, clustered columnstore wasn't the only new feature in SQL Server 2014. For example, check out the new cardinality estimator.

Q: Why am I so angry and bitter about the way my favorite feature was implemented?

A: You caught me - you didn't really ask that question - but I'll answer it anyway. Welcome to the world of third party software where not everything is built according to your exact specifications. If you feel passionately about a change you'd like to see in a Microsoft product, check out Connect.Microsoft.com. It's their feedback process where you can submit a change, other people can vote it up, and then the product team reads it and tells you why they won't implement it. Sometimes. Most of the time they just mark it as "won't fix, works on my machine" but hey, sometimes you do get some answers.