Sql-server – DATEADD not producing a SARGable expectation of an index seek

execution-planindexperformancequery-performancesql server

I have a basic [UserActivity] table that captures an ActivityTypeId per UserId and the ActivityDate at which the Activity occurred.

I am writing a query/stored procedure that allows for input of the @UserId, @ForTypeId, as well as the @DurationInterval and @DurationIncrement to dynamically return results based on N number of seconds/minutes/hours/days/months/years. Given that the datepart argument within DATEADD/DATEDIFF does not allow parameters, I had to revert to a bit of trickery in order to get the desired results within the WHERE clause.

Initially I wrote the query using DATEDIFF, but immediately after writing and taking a peek at the execution plan, I remembered that it is not a SARGable function (along with the fact the precision levels could offer for some dates falling off in a Leap Year). So, I re-wrote the query to utilize DATEPART thinking that I would hit an index seek instead of an index scan and generally perform better.

Unfortunately, I've found that writing the query as DATEADD offers up the same results: an index scan is occurring, and query optimizer is not leveraging the non-clustered index against [ActivityDate].

I read Aaron Bertrand's blog post, "Performance Surprises and Assumptions: DATEADD", and implemented the changes he described to CONVERT the DATEADD portion into the equivalent datetime2 column definition due to weird trickery involved with datetime2. However, the issue was still present even after doing so.

To better illustrate the scenario, here is a comparable table definition.

DROP TABLE IF EXISTS [dbo].[UserActivity]
IF OBJECT_ID('[dbo].[UserActivity]', 'U') IS NULL
BEGIN
    CREATE TABLE [dbo].[UserActivity] (
        [UserId] [int] NOT NULL
        ,[UserActivityId] [bigint] IDENTITY(1,1) NOT NULL
        ,[ActivityTypeId] [tinyint] NOT NULL
        ,[ActivityDate] [datetime2](0) NOT NULL CONSTRAINT [DF_UserActivity_ActivityDate] DEFAULT GETDATE()
        ,CONSTRAINT [PK_UserActivity] PRIMARY KEY CLUSTERED ([UserActivityId] ASC)
        ,INDEX [IX_UserActivity_UserId] NONCLUSTERED ([UserId] ASC)
        ,INDEX [IX_UserActivity_ActivityTypeId] NONCLUSTERED ([ActivityTypeId] ASC)
        ,INDEX [IX_UserActivity_ActivityDate] NONCLUSTERED ([ActivityDate] ASC)
    )
END;
GO

Populate the table with dummy data recursively for 5 different users with a random ActivityTypeId between 1 and 10 with a new ActivityDate every 4 minutes.

DECLARE @UserId int = (SELECT ISNULL((SELECT TOP (1) [UserId] + 1 FROM [dbo].[UserActivity] ORDER BY [UserId] DESC), 1))
;WITH [UserActivitySeed] AS (
    SELECT
        CONVERT(datetime2(0), '01/01/2018') AS 'ActivityDate'
    UNION ALL
    SELECT
        DATEADD(minute, 4, [ActivityDate])
    FROM
        [UserActivitySeed]
    WHERE
        [ActivityDate] < '2018-04-01')
INSERT INTO [dbo].[UserActivity] ([UserId], [ActivityTypeId], [ActivityDate])
SELECT
    @UserId
    ,ABS(CHECKSUM(NEWID()) % 9) + 1
    ,[ActivityDate]
FROM
    [UserActivitySeed] OPTION (MAXRECURSION 32767);

GO 5

ALTER INDEX ALL ON [dbo].[UserActivity] REBUILD;

Below is the first query I wrote with DATEDIFF. Note I am excluding the @UserId and @ForTypeId predicates intentionally so to avoid those key lookups and reduce noise within the plans attached.

As you'll find on PasteThePlan for this query, it is performing an index scan as expected given that DATEDIFF is not SARGable.

DECLARE @UserId int = 1
DECLARE @ForTypeId int = 3
DECLARE @DurationInterval varchar(6) = 'hour'
DECLARE @DurationIncrement int = 1

SELECT
    COUNT(UA.[UserActivityId]) AS 'ActivityTypeCount'
FROM
    [dbo].[UserActivity] UA
WHERE
    -- Exclude the @UserId and @ForTypeId predicates.
    -- UA.[UserId] = @UserId
    -- AND UA.[ActivityTypeId] = @ForTypeId
    -- AND 
    CASE
        WHEN @DurationInterval IN ('year', 'yy', 'yyyy') THEN DATEDIFF(SECOND, UA.[ActivityDate], GETDATE()) / 3600.0 / 24.0 / 365.25
        WHEN @DurationInterval IN ('month', 'mm', 'm') THEN DATEDIFF(SECOND, UA.[ActivityDate], GETDATE()) / 3600.0 / 24.0 / 365.25 * 12
        WHEN @DurationInterval IN ('day', 'dd', 'd') THEN DATEDIFF(SECOND, UA.[ActivityDate], GETDATE()) / 3600.0 / 24.0
        WHEN @DurationInterval IN ('hour', 'hh') THEN DATEDIFF(SECOND, UA.[ActivityDate], GETDATE()) / 3600.0
        WHEN @DurationInterval IN ('minute', 'mi', 'n') THEN DATEDIFF(SECOND, UA.[ActivityDate], GETDATE()) / 60.0
        WHEN @DurationInterval IN ('second', 'ss', 's') THEN DATEDIFF(SECOND, UA.[ActivityDate], GETDATE())
    END < @DurationIncrement

Below is the DATEADD query. PasteThePlan here. Unfortunately, an index seek is not occurring. This may be an incorrect assumption on my part, but I'm perplexed as to why it isn't occurring at all.

DECLARE @UserId int = 1
DECLARE @ForTypeId int = 3
DECLARE @DurationInterval varchar(6) = 'hour'
DECLARE @DurationIncrement int = 1

SELECT
    COUNT(UA.[UserActivityId]) AS 'ActivityTypeCount'
FROM
    [dbo].[UserActivity] UA
WHERE
    -- Exclude the @UserId and @ForTypeId predicates.
    -- UA.[UserId] = @UserId
    -- AND UA.[ActivityTypeId] = @ForTypeId
    -- AND 
    (
        (@DurationInterval IN ('year', 'yy', 'yyyy') AND UA.[ActivityDate] > CONVERT(datetime2(0), DATEADD(YEAR, -@DurationIncrement, GETDATE())))
        OR
        (@DurationInterval IN ('month', 'mm', 'm') AND UA.[ActivityDate] > CONVERT(datetime2(0), DATEADD(MONTH, -@DurationIncrement, GETDATE())))
        OR
        (@DurationInterval IN ('day', 'dd', 'd') AND UA.[ActivityDate] > CONVERT(datetime2(0), DATEADD(DAY, -@DurationIncrement, GETDATE())))
        OR
        (@DurationInterval IN ('hour', 'hh') AND UA.[ActivityDate] > CONVERT(datetime2(0), DATEADD(HOUR, -@DurationIncrement, GETDATE())))
        OR
        (@DurationInterval IN ('minute', 'mi', 'n') AND UA.[ActivityDate] > CONVERT(datetime2(0), DATEADD(MINUTE, -@DurationIncrement, GETDATE())))
        OR
        (@DurationInterval IN ('second', 'ss', 's') AND UA.[ActivityDate] > CONVERT(datetime2(0), DATEADD(SECOND, -@DurationIncrement, GETDATE())))
        )

What is the cause of this? Is the behavior I'm seeing a result of my usage of OR negating any potential for it to even get to using the index? Am I overlooking something painstakingly obvious here?

UPDATE: My second question above lead me to perform a query foregoing the OR operations. The query performed the index seek, so something is occurring during these comparisons that SQL Server does not like. PasteThePlan here.

DECLARE @DurationIncrement int = 1

SELECT
    COUNT(UA.[UserActivityId]) AS 'ActivityTypeCount'
FROM
    [dbo].[UserActivity] UA
WHERE
    UA.[ActivityDate] > CONVERT(datetime2(0), DATEADD(HOUR, -@DurationIncrement, GETDATE()))

UPDATE: Solution shared here.

Best Answer

The OR condition evaluates at compile-time, rather than at runtime, which means that your WHERE condition does not generate a seek.

And just to clean up the code, I refactored out your CONVERT to make the code a bit more readable.

I would try changing the WHERE clause to:

UA.[ActivityDate]>CONVERT(datetime2(0), (CASE
    WHEN @DurationInterval IN ('year', 'yy', 'yyyy') THEN DATEADD(year, -@DurationIncrement, GETDATE())
    WHEN @DurationInterval IN ('month', 'mm', 'm')   THEN DATEADD(month, -@DurationIncrement, GETDATE())
    WHEN ...
    END))

I don't have access to an environment where I can verify this, but please let me know if it works out.