SQL Server – Does a columnstore index reduce data space? How is this possible?

columnstore, sql-server

I was testing the performance of one of my queries and trying different indexes. When I applied a columnstore index to my table, the table size dropped by 70-80%. How is this possible?

Best Answer

A columnstore arranges the data on disk differently from how a "normal" (rowstore) table does. Each column's values are split into segments of just over one million values (up to 1,048,576 rows), and each segment is compressed. Since a single column's values can show a lot of repetition (think "country code" or "product name"), the compression ratio can be significant.
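To get a feel for why repetition helps so much, here is a toy sketch in Python of dictionary encoding, one technique column stores commonly use. It is only an illustration under my own assumptions: the column contents, the function names and the size estimate are all made up, and it does not reproduce SQL Server's actual on-disk format.

```python
def dictionary_encode(values):
    """Replace each value with a small integer code that indexes a dictionary."""
    dictionary = sorted(set(values))
    code_of = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [code_of[value] for value in values]


def packed_size_bytes(dictionary, codes):
    """Rough size estimate: dictionary strings plus bit-packed codes."""
    # Bits needed to tell the distinct values apart (at least 1).
    bits_per_code = max(1, (len(dictionary) - 1).bit_length())
    dictionary_bytes = sum(len(value.encode("utf-8")) for value in dictionary)
    code_bytes = (len(codes) * bits_per_code + 7) // 8
    return dictionary_bytes + code_bytes


# Hypothetical segment: 1,048,576 country codes with only 4 distinct values.
column = ["US", "GB", "DE", "FR"] * 262_144

raw_bytes = sum(len(value.encode("utf-8")) for value in column)
dictionary, codes = dictionary_encode(column)
packed_bytes = packed_size_bytes(dictionary, codes)

print(raw_bytes, packed_bytes)  # 2097152 262152
```

With four distinct values each code needs only 2 bits, so this made-up segment shrinks to about one eighth of its raw size. Real data will be less tidy, but the same effect is at work.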

Read performance can improve for several reasons. First, only the columns required by the query are read off disk. Second, compression means much less IO for a given number of values compared to rowstore. Third, aggregate functions can be performed in what's called "batch mode," which is optimised for CPU cache utilisation.
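The first point is simple arithmetic. A back-of-the-envelope sketch in Python, assuming a hypothetical six-column table with invented column widths, and ignoring compression, page overhead and any other indexes:

```python
# Hypothetical table: bytes per row for each column (illustrative numbers only).
column_widths = {
    "order_id": 8,
    "customer_id": 8,
    "country_code": 2,
    "product_name": 40,
    "quantity": 4,
    "unit_price": 8,
}
row_count = 1_000_000


def rowstore_bytes_scanned(widths, rows):
    """A rowstore scan reads whole rows, so every column is read."""
    return sum(widths.values()) * rows


def columnstore_bytes_scanned(widths, rows, needed_columns):
    """A columnstore scan reads only the columns the query references."""
    return sum(widths[name] for name in needed_columns) * rows


# e.g. a query that sums quantity per country_code
needed = ["country_code", "quantity"]

row_bytes = rowstore_bytes_scanned(column_widths, row_count)
col_bytes = columnstore_bytes_scanned(column_widths, row_count, needed)

print(row_bytes, col_bytes)  # 70000000 6000000
```

Even before compression, this imaginary query touches less than a tenth of the data in the columnar layout.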

Compression is also available for rowstores, in the form of row and page compression. My experience is that CPU utilisation increases but IO drops, for a nett improvement in elapsed query time. This was for moderately large databases performing reporting & analytics.

Of course, your mileage may vary.