Sql-server – Columnstore index query performance based on number of columns in SELECT statement

columnstoreperformancequery-performancesql serversql-server-2016

I have a table 'FactsTable1' which is horizontally partitioned on basis of date & it also has clustered columnstore index on it. I am running following queries on it.

Query 1 :

SELECT 
    SUM([Column1]), 
    SUM([Column2]), 
    SUM([Column3]), 
    SUM([Column4]), 
    SUM([Column5]), 
    SUM([Column6]) 
FROM [FactsTable1]

Query 2 :

SELECT 
    SUM([Column1]) 
FROM [FactsTable1]

When I ran above 2 queries with 'STATISTICS IO,TIME ON', I received following as output.

Query 1 :

SQL Server parse and compile time: 
   CPU time = 94 ms, elapsed time = 95 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
Warning: Null value is eliminated by an aggregate or other SET operation.

(1 row(s) affected)
Table 'FactsTable1'. Scan count 1, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 1184, lob physical reads 31, lob read-ahead reads 0.
Table 'FactsTable1'. Segment reads 22, segment skipped 0.

 SQL Server Execution Times:
   CPU time = 218 ms,  elapsed time = 247 ms.

Query 2 :

SQL Server parse and compile time: 
   CPU time = 46 ms, elapsed time = 57 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
Warning: Null value is eliminated by an aggregate or other SET operation.

(1 row(s) affected)
Table 'FactsTable1'. Scan count 1, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 198, lob physical reads 0, lob read-ahead reads 0.
Table 'FactsTable1'. Segment reads 22, segment skipped 0.

 SQL Server Execution Times:
   CPU time = 47 ms,  elapsed time = 52 ms.

I can see some difference always in CPU time when checked multiple times but I dont see any difference in query cost. Its always 50-50%.

So I want to know, does number of columns affect performance of a query for a partitioned table with clustered columnstore index? If yes, then is there any way to measure performance difference?

Additional information :

  • FactsTable1 contains 400000+ rows & 240 columns
  • Its partitioned on date column
  • It is heavily queried for analytic data.
  • @@version : Microsoft SQL Server 2016 (RTM) – 13.0.1601.5 (X64)

Best Answer

So I want to know, does number of columns affect performance of a query for a partitioned table with clustered columnstore index? If yes, then is there any way to measure performance difference?

Yes, the number of columns is often a significant factor for columnstore query performance as shown in shown in the last line your STATISTICS IO output. This is because columnstore allows data for only the columns needed by the query to be scanned. The elapsed time difference (247ms versus 52ms) indicates query performance is proportional to the number of columns (as I would expect), with elapsed time being the primary measure of performance.

Although one can generally use elapsed time as the primary performance indicator, there are other factors to consider. Cached data or other activity on shared resources could affect timings. Both plans are serial in this case so elapsed times are comparable for relative performance, with the assumption of a repeatable test in an isolated environment. However, if one plan were serial and the other parallel, the parallel plan could consume more CPU time than the wall clock time and result in a biased or incorrect conclusion (e.g. the query with more columns runs faster).