Sql-server – Empty result takes longer with all columns included

entity-frameworkperformancequery-performancesql serversql-server-2017

I'm executing a query every 1 second and most of the time it returns no results (SQL generated by Entity Framework):

SELECT TOP (5) 
   [Extent1].[ID] AS [ID],
   ***30 more columns***

   FROM ParentTable AS [Extent1]
   WHERE ([Extent1].[ImageTaken] = 1) AND ([Extent1].[ImageProjected] <> 1) AND ( EXISTS (SELECT 
      1 AS [C1]
      FROM ChildTable AS [Extent2]
      WHERE [Extent1].[ID] = [Extent2].[Parent_ID]
   ))

The above query take about 400ms. However, if I exclude all the columns except ID from the result it takes about 100ms.

Why is the execution time so different if there are no results? I looked at the execution plans and they look identical (I've never looked at an execution plan before today, so take that with a grain of salt).

I'd like to include all the columns, but obviously I only need them if there is a result.

Actual Execution Plans

All Columns

Id Column Only

EDIT
More details:

  • I'm using SQL Server Express 2017
  • The SQL Server is on my local machine
  • The parent table has ~1 million rows
  • The child table has ~40k rows

Parent Table Definition


    USE [DB]
    GO

    /****** Object:  Table [dbo].[Inspection_CapturedImageQueueItem]    Script Date: 1/8/2018 6:23:45 PM ******/
    SET ANSI_NULLS ON
    GO

    SET QUOTED_IDENTIFIER ON
    GO

    CREATE TABLE [dbo].[Inspection_CapturedImageQueueItem](
        [ID] [INT] IDENTITY(1,1) NOT NULL,
        [X] [FLOAT] NOT NULL,
        [Y] [FLOAT] NOT NULL,
        [Z] [FLOAT] NOT NULL,
        [rX] [FLOAT] NOT NULL,
        [rY] [FLOAT] NOT NULL,
        [rZ] [FLOAT] NOT NULL,
        [Priority] [INT] NOT NULL,
        [TimeTakenUTC] [datetime] NOT NULL,
        [ImageTaken] [bit] NOT NULL,
        [PartProgramInstance_Id] [INT] NULL,
        [PolarizerAngle1] [FLOAT] NOT NULL,
        [PolarizerAngle2] [FLOAT] NOT NULL,
        [ImageProjected] [bit] NOT NULL,
        [LaserTransform_X] [FLOAT] NOT NULL,
        [LaserTransform_Y] [FLOAT] NOT NULL,
        [LaserTransform_Z] [FLOAT] NOT NULL,
        [LaserTransform_rX] [FLOAT] NOT NULL,
        [LaserTransform_rY] [FLOAT] NOT NULL,
        [LaserTransform_rZ] [FLOAT] NOT NULL,
        [LaserPosition_X] [FLOAT] NOT NULL,
        [LaserPosition_Y] [FLOAT] NOT NULL,
        [LaserPosition_Z] [FLOAT] NOT NULL,
        [LaserPosition_rX] [FLOAT] NOT NULL,
        [LaserPosition_rY] [FLOAT] NOT NULL,
        [LaserPosition_rZ] [FLOAT] NOT NULL,
        [ProjectionError] [INT] NOT NULL,
        [LocalTransform_X] [FLOAT] NOT NULL,
        [LocalTransform_Y] [FLOAT] NOT NULL,
        [LocalTransform_Z] [FLOAT] NOT NULL,
        [LocalTransform_rX] [FLOAT] NOT NULL,
        [LocalTransform_rY] [FLOAT] NOT NULL,
        [LocalTransform_rZ] [FLOAT] NOT NULL,
        [IsHighAngleOfIncidence] [bit] NOT NULL,
     CONSTRAINT [PK_dbo.Inspection_CapturedImageQueueItem] PRIMARY KEY CLUSTERED
    (
        [ID] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] ADD  DEFAULT ((0)) FOR [PolarizerAngle1]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] ADD  DEFAULT ((0)) FOR [PolarizerAngle2]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] ADD  DEFAULT ((0)) FOR [ImageProjected]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] ADD  DEFAULT ((0)) FOR [LaserTransform_X]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] ADD  DEFAULT ((0)) FOR [LaserTransform_Y]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] ADD  DEFAULT ((0)) FOR [LaserTransform_Z]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] ADD  DEFAULT ((0)) FOR [LaserTransform_rX]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] ADD  DEFAULT ((0)) FOR [LaserTransform_rY]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] ADD  DEFAULT ((0)) FOR [LaserTransform_rZ]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] ADD  DEFAULT ((0)) FOR [LaserPosition_X]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] ADD  DEFAULT ((0)) FOR [LaserPosition_Y]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] ADD  DEFAULT ((0)) FOR [LaserPosition_Z]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] ADD  DEFAULT ((0)) FOR [LaserPosition_rX]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] ADD  DEFAULT ((0)) FOR [LaserPosition_rY]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] ADD  DEFAULT ((0)) FOR [LaserPosition_rZ]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] ADD  DEFAULT ((0)) FOR [ProjectionError]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] ADD  DEFAULT ((0)) FOR [LocalTransform_X]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] ADD  DEFAULT ((0)) FOR [LocalTransform_Y]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] ADD  DEFAULT ((0)) FOR [LocalTransform_Z]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] ADD  DEFAULT ((0)) FOR [LocalTransform_rX]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] ADD  DEFAULT ((0)) FOR [LocalTransform_rY]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] ADD  DEFAULT ((0)) FOR [LocalTransform_rZ]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] ADD  DEFAULT ((0)) FOR [IsHighAngleOfIncidence]
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem]  WITH CHECK ADD  CONSTRAINT [FK_dbo.Inspection_CapturedImageQueueItem_dbo.Inspection_PartProgramInstance_PartProgramInstance_Id] FOREIGN KEY([PartProgramInstance_Id])
    REFERENCES [dbo].[Inspection_PartProgramInstance] ([Id])
    GO

    ALTER TABLE [dbo].[Inspection_CapturedImageQueueItem] CHECK CONSTRAINT [FK_dbo.Inspection_CapturedImageQueueItem_dbo.Inspection_PartProgramInstance_PartProgramInstance_Id]
    GO

Child Table Definition


USE [DB]
GO

/****** Object:  Table [dbo].[Inspection_CapturedImageItem]    Script Date: 1/11/2018 9:01:34 AM ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[Inspection_CapturedImageItem](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [PolarizerAngle] [float] NOT NULL,
    [ImageRequest_ID] [int] NULL,
    [TimeTakenUTC] [datetime] NOT NULL,
    [ProjectorNumber] [int] NOT NULL,
    [AngleOfIncidence] [float] NOT NULL,
    [MirrorRx] [float] NOT NULL,
    [MirrorRy] [float] NOT NULL,
 CONSTRAINT [PK_dbo.Inspection_CapturedImageItem] PRIMARY KEY CLUSTERED 
(
    [ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

ALTER TABLE [dbo].[Inspection_CapturedImageItem] ADD  DEFAULT ('1900-01-01T00:00:00.000') FOR [TimeTakenUTC]
GO

ALTER TABLE [dbo].[Inspection_CapturedImageItem] ADD  DEFAULT ((0)) FOR [ProjectorNumber]
GO

ALTER TABLE [dbo].[Inspection_CapturedImageItem] ADD  DEFAULT ((0)) FOR [AngleOfIncidence]
GO

ALTER TABLE [dbo].[Inspection_CapturedImageItem] ADD  DEFAULT ((0)) FOR [MirrorRx]
GO

ALTER TABLE [dbo].[Inspection_CapturedImageItem] ADD  DEFAULT ((0)) FOR [MirrorRy]
GO

ALTER TABLE [dbo].[Inspection_CapturedImageItem]  WITH CHECK ADD  CONSTRAINT [FK_dbo.Inspection_CapturedImageItem_dbo.Inspection_CapturedImageQueueItem_ImageRequest_ID] FOREIGN KEY([ImageRequest_ID])
REFERENCES [dbo].[Inspection_CapturedImageQueueItem] ([ID])
GO

ALTER TABLE [dbo].[Inspection_CapturedImageItem] CHECK CONSTRAINT [FK_dbo.Inspection_CapturedImageItem_dbo.Inspection_CapturedImageQueueItem_ImageRequest_ID]
GO

Indices

enter image description here

Best Answer

The performance issue that you have with that scan is a mysterious one. In general I would expect to see very similar cpu times for scans that don't return any rows. There aren't any computed columns and the logical read count is identical. Not surprisingly I couldn't reproduce that issue on my machine. Perhaps it's related somehow to Express edition, but that's a wild guess.

I can offer you a workaround to your performance issue. Looking at the actual plan, we can see that the Inspection_CapturedImageItem table only has one distinct value for ImageRequest_ID. The join from that table to Inspection_CapturedImageQueueItem is against the primary key and the clustered key of the table, so a nested loop join plan with Inspection_CapturedImageItem as the outer table would likely be very efficient. It would certainly be more efficient than scanning the entire Inspection_CapturedImageItem table.

So why does the query optimizer pick a merge join instead? The clustered index scan cost is reduced by the row goal that's introduced from the TOP (5) part of the query. In fact, with a bit of math I can give you an approximate cardinality estimate for the [Extent1].[ImageTaken] = 1) AND ([Extent1].[ImageProjected] <> 1) filters:

scan cost in optimizer units = 0.0031895 + LEAST(1, ROW_GOAL / CARDINALITY_ESTIMATE) * (FULL_SCAN_COST – 0.0031895)

0.0031895 + ( 5 / CARDINALITY_ESTIMATE) * (23.4928 + 1.11635 – 0.0031895) = 0.0453535

CARDINALITY_ESTIMATE = (5 * (23.4928 + 1.11635 – 0.0031895)) / (0.0453535 - 0.0031895)

CARDINALITY_ESTIMATE = 2918

SQL Server thinks that there are about 2918 rows that match your filters from the Inspection_CapturedImageQueueItem table. Your query only needs the first five results. The optimizer assumes that it won't have to scan the entire table in order to those five rows. After all, there are 2918 of them in the table. Unfortunately, here you are filtering on predicates for which there are no matching rows in the table. This is the worst possible case for row goals. You pay the full cost of the scan but the query optimizer creates a plan assuming that you only need to scan 0.3% of the table.

There are two ways to solve this problem: you can give the optimizer better information or you can force the optimizer do use a plan that you think is better. The first option is generally preferred when possible. There aren't enough details to dig into it here, but you might benefit from statistics updates on the ImageTaken and ImageProjected columns. Updating statistics on theImageRequest_ID column of Inspection_CapturedImageItem could be beneficial as well. The query optimizer estimates that there are 42 distinct rows but there's just one.

If you're confident that you understand why you're getting a bad plan and that the data won't change over time you could consider a query hint. The use hint for disabling row goals, HINT('DISABLE_OPTIMIZER_ROWGOAL'), may be a good choice. With that hint you are instructing the query optimizer not to optimize the query to return the first five rows as quickly as possible. Instead, a plan will be created to efficiently return all rows in the result set. Even with the poor cardinality estimate on Inspection_CapturedImageItem you're likely to end up with a nested loop join that should be faster than the query that does the scan.