Say I have a partitioned table, set up like so:
CREATE PARTITION FUNCTION PF_Month (DATE) AS RANGE RIGHT FOR VALUES (
'2017-01-01',
'2017-02-01',
'2017-03-01',
'2017-04-01',
'2017-05-01',
'2017-06-01',
);
GO
CREATE PARTITION SCHEME PS_Month AS PARTITION PF_Month ALL TO ([Primary]);
GO
CREATE TABLE Logs
(
Id INT NOT NULL,
DateRecorded DATE NOT NULL,
FixStatus INT NOT NULL
);
GO
ALTER TABLE Logs
ADD CONSTRAINT PK_Logs PRIMARY KEY (Id, DateRecorded)
ON PS_Month(DateRecorded);
GO
CREATE NONCLUSTERED INDEX [IX_DateRecorded] ON Logs(DateRecorded)
INCLUDE(FixStatus)
ON PS_Month(DateRecorded);
GO
If I want to query the logs, in order of date, I have been told I can use the partition number in order to avoid a sort when the results from each partition are joined back together.
SELECT * FROM Logs WHERE ... ORDER BY $PARTITION.PF_Month(DateRecorded), DateRecorded
Are the partition numbers in order of when each partition was created, or are they in order of DateRecorded
? E.g. if I were to add another split to the function while it is in use, would ordering by partition number still work?
ALTER PARTITION FUNCTION PF_Month() SPLIT RANGE ('2016-12-01')
Best Answer
I wasn't able to find a formal guarantee of this behavior, but I did find multiple examples — including a modified version of your original example — in which query optimization decisions appear to be made based on a guarantee that partition numbers are in order by value. Here's what I found:
Your example
Your example code doesn't currently compile, and also uses an index that does not cover the sample query. Here's an updated script that contains working code for your original example.
With these updates, we can see that the query plan for your ORDER BY query does not include a sort operator; query optimization appears to rely on the fact that partition numbers are ordered by value in order to omit the extra sort.
Documentation
There is strong evidence in the documentation that partition numbers are in order by value, but I agree that the documentation itself is not conclusive. Here are a few pieces of evidence that I found:
The CREATE PARTITION FUNCTION documentation states that the partitions will initially be in order:
The sys.partition_range_values documentation states that
boundary_id
, which appears to be used to define the partition number, is anThe $PARTITION documentation stats that
Further Testing
With a bit of testing, we can see that there are multiple cases in which SQL Server makes query plan decisions that appear to rely on the fact that partition numbers are in order by value.
Here's the full test script, and here are a few of the most relevant queries: