SQL Server – Why Primary (Clustered) Key Is Not Used in Query

execution-planoptimizationperformancequery-performancesql serversql-server-2008-r2

I have a SQL Server 2008 R2 table whose schema structure looks as follows:

CREATE TABLE [dbo].[CDSIM_BE]
(
    [ID] [bigint] NOT NULL,
    [EquipmentID] [varchar](50) NOT NULL,
    [SerialNumber] [varchar](50) NULL,
    [PyrID] [varchar](50) NULL,
    [MeasMode] [varchar](50) NULL,
    [ReadTime] [datetime] NOT NULL,
    [SubID] [varchar](15) NULL,
    [ProbePosition] [float] NULL,
    [DataPoint] [int] NULL,

    CONSTRAINT [PK_CDSIM_BE] 
    PRIMARY KEY CLUSTERED ([ID] ASC, [EquipmentID] ASC, [ReadTime] ASC)
         WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
               IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, 
               ALLOW_PAGE_LOCKS = ON) ON [MonthlyArchiveScheme9]([ReadTime])
) ON [MonthlyArchiveScheme9]([ReadTime])

CREATE NONCLUSTERED INDEX [idx_CDSIM_BE__SubID_ProbePosition] 
ON [dbo].[CDSIM_BE] ([SubID] ASC, [ProbePosition] ASC)
INCLUDE ([EquipmentID], [ReadTime], [BECorr]) 
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
          SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, 
          ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [MonthlyArchiveScheme9]([ReadTime])

CREATE NONCLUSTERED INDEX [IX_CDSIM_BE_ProbePosition] 
ON [dbo].[CDSIM_BE] ([ProbePosition] ASC)
INCLUDE ([SerialNumber], [SubID]) 
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
          SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, 
          ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [MonthlyArchiveScheme9]([ReadTime])

CREATE NONCLUSTERED INDEX [IX_CSDIM_Readtime] 
ON [dbo].[CDSIM_BE]([ReadTime] ASC)
INCLUDE ([EquipmentID]) 
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
          SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, 
          ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [MonthlyArchiveScheme9]([ReadTime])

And I'm executing this simple query:

Select Max(Id)
From dbo.CDSIM_BE

There are ~2.5B rows in the table.

The query plan shows an index scan being done on the IX_CdSIM_BE_ProbePosition index. I'm wondering why SQL Server simply wouldn't use the clustered (and primary) index and immediately go to the last row in the table and retrieve the Id value, since that must be the maximum.

Best Answer

The clustered index is partitioned on ReadTime so it couldn't use the PK as you describe. It would need to find the Max(Id) for each partition and then find the max of those. It is possible to rewrite the query to get such a plan however.

Using an example based on the article here a possible rewrite might be

SELECT MAX(ID) AS ID
FROM   sys.partitions AS P
       CROSS APPLY (SELECT MAX(ID) AS ID
                    FROM   [dbo].[CDSIM_BE]
                    WHERE  $PARTITION.MonthlyArchiveFunction9(ReadTime) 
                                                    = P.partition_number) AS A
WHERE  P.object_id = OBJECT_ID('dbo.CDSIM_BE')
       AND P.index_id <= 1; 

To process each partition in turn.

Note the plan still has a scan (with a seek predicate to select the partition) but this is not a full scan of the partition.

The scan is in index order with direction "BACKWARD". The TOP iterator can stop requesting rows from the scan after the first one is received.

enter image description here