This is type of query I'm trying to run:
WITH CTE_Ordered AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY PartitionField ORDER BY DateField) AS PartitionRowId
FROM SourceTable
),
CTE_Top1_PerPartition AS
(
SELECT *
FROM CTE_Ordered
WHERE PartitionRowId = 1
),
CTE_Calculations AS
(
SELECT AVG(NumberField1) AS NumberField1_Avg, StdDev.StdDev AS NumberField1_StdDev
FROM CTE_Top1_PerPartition
CROSS JOIN
(
SELECT STDEV(NumberField1) AS StdDev
FROM CTE_Top1_PerPartition
) AS StdDev
GROUP BY StdDev.StdDev
)
-- Final Select
SELECT *
FROM CTE_Calculations
Every time I run the final select, my NumberField1_StdDev value changes even though SourceTable is isolated and doesn't change.
I noticed if I select CTE_Top1_PerPartition into a temp table first and then run the rest of the query off of that temp table, then I get the same result for NumberField1_StdDev every time.
I would guess it has something to do with the order the results are returned in the CTE_Top1_PerPartition CTE, but I'm not sure why.
Best Answer
row_number
is not deterministic if there can be ties (i.e. rows with the samePartitionField
andDateField
values). Any of the tied values might end up with aPartitionRowId
of 1 which would presumably change the final result.You could use
rank
instead ofrow_number
but that would cause you to consider all the tied rows which may not be what you want. You could also update your analytic function to add additionalorder by
criteria to ensure thatrow_number
returns a deterministic first row.