SQL Server Index Tuning – Making Index Work on Join to Table Variable

execution-planindexindex-tuningsql servert-sql

I a table with several columns, but the most importants are: DocumentId(bigint), StatusId(int) and ReceiveDateTime(datetimeoffset(3)).

CREATE TABLE [DocumentInfo](
    [DocumentId] [bigint] NOT NULL,
    [StatusId] [int] NOT NULL,
    [ReceiveDateTime] [datetimeoffset](3) NOT NULL,
    [IsDeleting] [bit] NOT NULL,
    [StorageId] [uniqueidentifier] NOT NULL,
 CONSTRAINT [PK_DocumentInfo] PRIMARY KEY CLUSTERED 
(
    [DocumentId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

I also have a table variable which contains several rows (4 for now):

DECLARE @Tresholds TABLE
(
    DocumentStatusId int NOT NULL,
    Treshold datetime2 NOT NULL
);

I used indices on DocumentStatusId (is was a primary clustered key), and unique constraint (Treshold, DocumentStatusId), but query didn't use it, so I removed them.

My query looks like:

WITH TopOutdatedDocuments AS
(
    SELECT TOP (100000) di.DocumentId
    FROM [DocumentInfo] di WITH (NOLOCK)
    JOIN @Tresholds t ON di.StatusId = t.DocumentStatusId
    WHERE (ReceiveDateTime < @DocOldestTreshold OR (ReceiveDateTime < @DocNewestTreshold AND ReceiveDateTime < t.Treshold))
    AND NOT EXISTS (SELECT * FROM [doc].[document] d (NOLOCK) WHERE d.ParentId = di.DocumentId)
) 

The problem that the query plan always use index scan:
enter image description here

I tried to create an index for every permutation of these 3 columns, but it didn't use it anyway. DocOldestTreshold and @DocNewestTreshold are used to speedup a query, becuase they can be computed without a join. But it seems that engine doesn't use this opportunity.

How can I speedup a query? Its logical sense is that we have some tresholds for every StatusId, and if document is older than its treshold, it should be selected by this query.

Any help will be appreciated. I am now on SQL Server 2014. Order of [DocumentInfo]'s size is hundred millions of rows, typical size of @Tresholds is < 10.

Best Answer

Assuming your variables are always NOT NULL

WHERE  ( ReceiveDateTime < @DocOldestTreshold
          OR ( ReceiveDateTime < @DocNewestTreshold
               AND ReceiveDateTime < t.Treshold ) ) 

Is equivalent to

WHERE  ReceiveDateTime < @DocOldestTreshold
        OR ReceiveDateTime < LEAST(@DocNewestTreshold, t.Treshold) 

Which is equivalent to

WHERE  ReceiveDateTime < GREATEST(@DocOldestTreshold, LEAST(@DocNewestTreshold, t.Treshold)) 

So if you have this index

CREATE INDEX ix
  ON [DocumentInfo](StatusId, [ReceiveDateTime])

And the following query

SELECT di.DocumentId
FROM   @Tresholds T
       CROSS APPLY (SELECT IIF(@DocNewestTreshold < t.Treshold, @DocNewestTreshold, t.Treshold)) CA1(Least)
       CROSS APPLY (SELECT IIF(@DocOldestTreshold > CA1.Least, @DocOldestTreshold, CA1.Least)) CA2(Target)
       JOIN [DocumentInfo] di
         ON di.StatusId = t.DocumentStatusId
            AND di.ReceiveDateTime < CA2.Target
            AND NOT EXISTS (SELECT *
                            FROM   [doc].[document] d (NOLOCK)
                            WHERE  d.ParentId = di.DocumentId) 

It can do an equality seek on StatusId and a range seek on ReceiveDateTime