Sql-server – Query runs quicker when column calculation is done after initial query

query-performancesql server

I have always been led to believe that since SQL evals the SELECT after filtering that putting a calculation in the select of a query should be as fast or faster than running the query and then doing the calculation over the result. So I assume I am doing something wrong in the following query. The JOINS and the WHERE are the same in both, in the first the calculation is done after the result returns (takes about 3 seconds). The second query has the same calculation in the initial query (takes about 1 minute). I have added the query plan. Running on SQL 2008.

  DECLARE @MsgTable TABLE (
    [Type] VARCHAR(64),
    [EventDate] [datetime],
    [LicenseId] bigint,
    [RequestedOn] [datetime]
  );

  -- Calculate AFTER 
  INSERT INTO @MsgTable
  SELECT
    Type,
    EventDate,
    EventLog.LicenseId,
    NULL
  FROM EventLog
    INNER JOIN #LTable L ON (EventLog.LicenseId = L.LicenseId)
    INNER JOIN LicenseConfig LC ON LC.LicenseId = EventLog.LicenseId AND ((LC.ConfigKey = 'ConversionStatus'))
  WHERE 
    (EventLog.Type IN ('ERPValidationRequest', 'ERPMappingComments'))
    AND 
    (EventLog.EventDate >= @StartDate)
    AND 
    (EventLog.EventDate < @EndDate)
  ;

  SELECT 
    Type,
    EventDate,
    LicenseId,
    (
      SELECT TOP 1 ROEventLog.EventDate 
      FROM EventLog ROEventLog
      WHERE 
        ROEventLog.LicenseId = MT.LicenseId
        AND
        ROEventLog.Type = 'ERPMapRequest' 
        AND
        ROEventLog.EventDate < MT.EventDate
      ORDER BY ROEventLog.EventDate DESC
    ) RequestedOn
  FROM @MsgTable MT;

  DELETE FROM @MsgTable;

  -- Calculate during
  INSERT INTO @MsgTable
    SELECT
      Type,
      EventDate,
      EventLog.LicenseId,
      (
        SELECT TOP 1 ROEventLog.EventDate 
        FROM EventLog ROEventLog
        WHERE 
          ROEventLog.LicenseId = EventLog.LicenseId
          AND
          ROEventLog.Type = 'ERPMapRequest' 
          AND
          ROEventLog.EventDate < EventLog.EventDate
        ORDER BY ROEventLog.EventDate DESC
      ) RequestedOn
    FROM EventLog
      INNER JOIN #LTable L ON (EventLog.LicenseId = L.LicenseId)
      INNER JOIN LicenseConfig LC ON LC.LicenseId = EventLog.LicenseId AND ((LC.ConfigKey = 'ConversionStatus'))
    WHERE 
      (EventLog.Type IN ('ERPValidationRequest', 'ERPMappingComments'))
      AND 
      (EventLog.EventDate >= @StartDate)
      AND 
      (EventLog.EventDate < @EndDate)
  ;

  SELECT * FROM @MsgTable;

enter image description here

Best Answer

In query 4, the one-row from the table variable convinces the optimizer to choose an index seek and key lookup on the EventLog table. The true number of rows returned is 49:

Table variable scan

For query 6, the optimizer produces an over-estimate (22,659 vs 49):

22,659 estimate

The higher number of estimated executions causes the optimizer to choose to build a temporary index (Eager Index Spool) on the EventLog table instead of performing 22,659 key lookups. This temporary index is built and dropped on each execution of the query. The result is a lower estimated cost plan, but it runs much longer in practice.

Adjusting the indexing of the EventLog table is one way to resolve this issue.

For a more detailed answer, please provide the actual (post-execution) plans in xml form e.g. via https://www.brentozar.com/pastetheplan/

Related Question