If I let the server decide which index to use, it picks IX_MachineryId
, and it takes up to a minute.
That index is not partitioned, so the optimizer recognizes it can be used to provide the ordering specified in the query without sorting. As a non-unique nonclustered index, it also has the keys of the clustered index as subkeys, so the index can be used to seek on MachineryId
and the DateRecorded
range:
The index does not include OperationalSeconds
, so the plan has to look that value up per row in the (partitioned) clustered index in order to test OperationalSeconds > 0
:
The optimizer estimates that one row will need to be read from the nonclustered index and looked up to satisfy the TOP (1)
. This calculation is based on the row goal (find one row quickly), and assumes a uniform distribution of values.
From the actual plan, we can see the estimate of 1 row is inaccurate. In fact, 19,039 rows have to be processed to discover that no rows satisfy the query conditions. This is the worst case for a row goal optimization (1 row estimated, all rows actually needed):
You can disable row goals with trace flag 4138. This would most likely result in SQL Server choosing a different plan, possibly the one you forced. In any case, the index IX_MachineryId
could be made more optimal by including OperationalSeconds
.
It is quite unusual to have non-aligned nonclustered indexes (indexes partitioned in a different way from the base table, including not at all).
That really suggests to me that I have made the index right, and the server is just making a bad decision. Why?
As usual, the optimizer is selecting the cheapest plan it considers.
The estimated cost of the IX_MachineryId
plan is 0.01 cost units, based on the (incorrect) row goal assumption that one row will be tested and returned.
The estimated cost of the IX_MachineryId_DateRecorded
plan is much higher, at 0.27 units, mostly because it expects to read 5,515 rows from the index, sort them, and return the one that sorts lowest (by DateRecorded
):
This index is partitioned, and cannot return rows in DateRecorded
order directly (see later). It can seek on MachineryId
and the DateRecorded
range within each partition, but a Sort is required:
If this index were not partitioned, a sort would not be required, and it would be very similar to the other (unpartitioned) index with the extra included column. An unpartitioned filtered index would be slightly more efficient still.
You should update the source query so that the data types of the @From
and @To
parameters match the DateRecorded
column (datetime
). At the moment, SQL Server is computing a dynamic range due to the type mismatch at runtime (using the Merge Interval operator and its subtree):
<ScalarOperator ScalarString="GetRangeWithMismatchedTypes([@From],NULL,(22))">
<ScalarOperator ScalarString="GetRangeWithMismatchedTypes([@To],NULL,(22))">
This conversion prevents the optimizer from reasoning correctly about the relationship between ascending partition IDs (covering a range of DateRecorded
values in ascending order) and the inequality predicates on DateRecorded
.
The partition ID is an implicit leading key for a partitioned index. Normally, the optimizer can see that ordering by partition ID (where ascending IDs map to ascending, disjoint values of DateRecorded
) then DateRecorded
is the same as ordering by DateRecorded
alone (given that MachineryID
is constant). This chain of reasoning is broken by the type conversion.
Demo
A simple partitioned table and index:
CREATE PARTITION FUNCTION PF (datetime)
AS RANGE LEFT FOR VALUES ('20160101', '20160201', '20160301');
CREATE PARTITION SCHEME PS AS PARTITION PF ALL TO ([PRIMARY]);
CREATE TABLE dbo.T (c1 integer NOT NULL, c2 datetime NOT NULL) ON PS (c2);
CREATE INDEX i ON dbo.T (c1, c2) ON PS (c2);
INSERT dbo.T (c1, c2)
VALUES (1, '20160101'), (1, '20160201'), (1, '20160301');
Query with matched types
-- Types match (datetime)
DECLARE
@From datetime = '20010101',
@To datetime = '20090101';
-- Seek with no sort
SELECT T2.c2
FROM dbo.T AS T2
WHERE T2.c1 = 1
AND T2.c2 >= @From
AND T2.c2 < @To
ORDER BY
T2.c2;
Query with mismatched types
-- Mismatched types (datetime2 vs datetime)
DECLARE
@From datetime2 = '20010101',
@To datetime2 = '20090101';
-- Merge Interval and Sort
SELECT T2.c2
FROM dbo.T AS T2
WHERE T2.c1 = 1
AND T2.c2 >= @From
AND T2.c2 < @To
ORDER BY
T2.c2;
Best Answer
Your query accesses 10 partitions and you are searching a 10 month range so my guess would be that it is partitioned on month of
DateRecorded
.I can reproduce your plan with the sort with the below.
but technically a sort could be avoided if you could get a plan that processed the partitions in order and just concatenated one ordered result to the next.
If you are happy to assume that the partition numbers will be in order of value (I don't know if this is actually guaranteed but it seems to be the case even after partition splits) then adding a leading column to the sort of the partition number achieves this