SQL Server – Count Aggregate Not Respecting Pivot Columns

pivotsql serversql-server-2016t-sql

In the pivot query below, why is every cell in a given row identical?

I was able to fix the query by replacing count(voo) with sum(voo), but I don't understand why count(voo) does not respect the AppVersion column. I tried to follow Microsoft's Complex PIVOT Example, which also used Count.

select pv.EventDate, [VersionAA],[VersionBB],[VersionCC],[VersionDD]
from 
(
    select ah.EventDate,ah.AppVersion,1 voo
    from AppHistory ah
    where EventDate > '2018-02-04' and reason = 1
) source
PIVOT(
    count(voo) --Fix: Replace "count(voo)" with "sum(voo)"
    For AppVersion in ([VersionAA],[VersionBB],[VersionCC],[VersionDD])
) as pv
order by EventDate

Example output with count(voo):

+------------+-----------+-----------+-----------+-----------+
| EventDate  | VersionAA | VersionBB | VersionCC | VersionDD |
+------------+-----------+-----------+-----------+-----------+
| 2018-02-05 |      1315 |      1315 |      1315 |      1315 |
| 2018-02-06 |      1200 |      1200 |      1200 |      1200 |
+------------+-----------+-----------+-----------+-----------+

Corrected output with sum(voo):

+------------+-----------+-----------+-----------+-----------+
| EventDate  | VersionAA | VersionBB | VersionCC | VersionDD |
+------------+-----------+-----------+-----------+-----------+
| 2018-02-05 |        12 |        18 |       300 |         3 |
| 2018-02-06 |        22 |        30 |       310 |         5 |
+------------+-----------+-----------+-----------+-----------+

My intent, which was met by sum(voo), was to to create a query which returned:
1) One row per day
2) One column per AppVersion (happily, the list of AppVersions is static)
3) Each cell contains a count of how many rows within AppHistory match the date and AppVersion of the column.

For reference AppHistory looks something like this:

CREATE TABLE [dbo].[AppHistory](
    [UserID] [bigint] NOT NULL,
    [AppVersion] [varchar](100) NOT NULL,
    [Reason] [int] NOT NULL,
    [EventDate] [datetime] NOT NULL,
    [TotalActions] [int] NOT NULL,
    [created] [datetime] NOT NULL
)

EDIT

I tried to reproduce the problem using a small amount of data and failed. In fact, the behavior goes away if I run select * into #AppHistory from AppHistory where EventDate > '2018-02-04' and reason = 1 and then replace AppHistory with #AppHistory in my queries. So now I'm even more confused. CheckDB says the database is fine.

I honestly didn't think of this until now, but AppHistory is a bit crazy:
A) Has almost 1 billion rows
B) Has a partition scheme
C) Has a clustered columnstore index.

That said, creating a new table with the same partition scheme and columnstore index also failed to reproduce this behavior. So, now I'm even more mystified.

Query Plans:
(Bad Result)Plan using Count
(Good Result)Plan using Sum

Best Answer

Looks to me like a bug when producing the query plan when you have batch mode execution.

Here is a repro that shows the issue both in SQL Server 2016 (13.0.4001.0) and SQL Server 2017 (14.0.3015.40).

create table dbo.CS(CS bit not null, index IX_CS clustered columnstore);
create table dbo.X(C int not null);

go

insert into dbo.X(C)
select top(10000) row_number() over(order by (select null)) / 10
from sys.columns as c1, sys.columns as c2;

go

select P.[1], P.[2]
from (
     select X.C, 1 as X
     from dbo.X
     ) as T
pivot (
      count(T.X) for T.C in ([1],[2])
      ) as P
left outer join dbo.CS on 1 = 0;

select P.[1], P.[2]
from (
     select X.C, 1 as X
     from dbo.X
     ) as T
pivot (
      count(T.X) for T.C in ([1],[2])
      ) as P
left outer join dbo.CS on 1 = 0
option (querytraceon 9453);

Result:

1           2
----------- -----------
10000       10000


1           2
----------- -----------
10          10

Left join on dbo.CS here makes the query use batch mode and the trace flag 9453 disables batch mode.