SQL Server Partitioning – Are Partition Numbers Ordered by Value?

partitioningsql server

Say I have a partitioned table, set up like so:

CREATE PARTITION FUNCTION PF_Month (DATE) AS RANGE RIGHT FOR VALUES (
  '2017-01-01',
  '2017-02-01',
  '2017-03-01',
  '2017-04-01',
  '2017-05-01',
  '2017-06-01',
);
GO

CREATE PARTITION SCHEME PS_Month AS PARTITION PF_Month ALL TO ([Primary]);
GO

CREATE TABLE Logs
(
     Id           INT NOT NULL,
     DateRecorded DATE NOT NULL,
     FixStatus    INT NOT NULL
);
GO

ALTER TABLE Logs
    ADD CONSTRAINT PK_Logs PRIMARY KEY (Id, DateRecorded)
    ON PS_Month(DateRecorded);
GO

CREATE NONCLUSTERED INDEX [IX_DateRecorded] ON Logs(DateRecorded)
    INCLUDE(FixStatus)
    ON PS_Month(DateRecorded);
GO

If I want to query the logs, in order of date, I have been told I can use the partition number in order to avoid a sort when the results from each partition are joined back together.

SELECT * FROM Logs WHERE ... ORDER BY $PARTITION.PF_Month(DateRecorded), DateRecorded

Are the partition numbers in order of when each partition was created, or are they in order of DateRecorded? E.g. if I were to add another split to the function while it is in use, would ordering by partition number still work?

ALTER PARTITION FUNCTION PF_Month() SPLIT RANGE ('2016-12-01')

Best Answer

I wasn't able to find a formal guarantee of this behavior, but I did find multiple examples — including a modified version of your original example — in which query optimization decisions appear to be made based on a guarantee that partition numbers are in order by value. Here's what I found:

Your example

Your example code doesn't currently compile, and also uses an index that does not cover the sample query. Here's an updated script that contains working code for your original example.

With these updates, we can see that the query plan for your ORDER BY query does not include a sort operator; query optimization appears to rely on the fact that partition numbers are ordered by value in order to omit the extra sort.

-- Create a partition-aligned index that includes all
-- columns in your SELECT statements
CREATE NONCLUSTERED INDEX [IX_DateRecorded] ON Logs(DateRecorded)
     INCLUDE (FixStatus) ON PS_Month(DateRecorded)
GO

-- Now that your index is properly defined, this query produces
-- an ordered index scan without a sort
SELECT * FROM Logs ORDER BY DateRecorded
GO

Documentation

There is strong evidence in the documentation that partition numbers are in order by value, but I agree that the documentation itself is not conclusive. Here are a few pieces of evidence that I found:

The CREATE PARTITION FUNCTION documentation states that the partitions will initially be in order:

If the values are not in order, the Database Engine sorts them, creates the function, and returns a warning that the values are not provided in order.

The sys.partition_range_values documentation states that boundary_id, which appears to be used to define the partition number, is an

ID (1-based ordinal) of the boundary value tuple, with left-most boundary starting at an ID of 1.

The $PARTITION documentation stats that

$PARTITION returns an int value between 1 and the number of partitions of the partition function.

Further Testing

With a bit of testing, we can see that there are multiple cases in which SQL Server makes query plan decisions that appear to rely on the fact that partition numbers are in order by value.

Here's the full test script, and here are a few of the most relevant queries:

-- The predicate x BETWEEN 4 AND 7 is converted to a range seek
-- on the partition number, strongly indicating that there is a
-- guarantee that partition numbers are in order by value.
-- Seek Keys[1]:
--      Start: PtnId1001 >= Scalar Operator(...[@1]...),
--      End: PtnId1001 <= Scalar Operator(...[@2]...)
SELECT COUNT(*)
FROM dbo.test_partition_ranges
WHERE x BETWEEN 4 AND 7
GO

-- If we change our predicate to use the partition number directly,
-- we see the same seek predicate.
-- Seek Keys[1]:
--      Start: PtnId1001 >= Scalar Operator(...[@1]...),
--      End: PtnId1001 <= Scalar Operator(...[@2]...)
SELECT COUNT(*)
FROM dbo.test_partition_ranges
WHERE $PARTITION.PF_INT_1to10(x) BETWEEN 4 AND 7
GO

-- If we mix conflicting predicates that use values and partition numbers,
-- SQL Server knows that no partitions are eligible.
-- "Actual Partition Count" is 0 and there are 0 logical reads
SELECT COUNT(*)
FROM dbo.test_partition_ranges
WHERE $PARTITION.PF_INT_1to10(x) > 5
    AND x <= 5
GO

-- All 10 partitions are accessed, and ordering by the partition column
-- does not produce a sort! It appears that SQL Server is once again
-- relying on the partition numbers being in order by value.
SELECT x
FROM dbo.test_partition_ranges
ORDER BY x
GO