SQL Server Performance – Predicate Clause vs Single Field

sql serversql-server-2012t-sql

We have been working on an application that utilizes a view with several tables in our database. This view is utilized throughout the application, and although significantly more complicated than the example I've posted below, most queries run relatively quickly. However, we received reports from users that one query in particular takes an inordinate amount of time. The query currently in use is similar to the following sample.

The view:

CREATE VIEW [dbo].[LineItemsForReview]
AS
SELECT        reviewer.Id AS UserId, reviewer.FirstName + ' ' + reviewer.LastName AS UserFullName, ali.Id, ali.PurchaseId, ali.PurchaseTypeId, ali.CreatedDate, ali.ModifiedDate, 
                         ali.BudgetFY, ali.Fund, ali.BudgetActivity, ali.DocSource, ali.DocumentNumber, ali.DocumentDate, ali.VendorName, ali.ObligatedAmt, ali.ReceivedAmt, ali.BilledAmt, 
                         ali.LastTransactionDate, ali.IsPriority
FROM            dbo.AccountingLineItem AS ali LEFT OUTER JOIN
                         dbo.Assignment AS a ON ali.Id = a.LineItemId LEFT OUTER JOIN
                         Auth.[User] AS reviewer ON a.AssignedReviewerUserId = reviewer.Id LEFT OUTER JOIN
                         Auth.[User] AS assignor ON a.AssignorUserId = assignor.Id

The query currently in use:

SELECT [UserId]
      ,[UserFullName]
      ,[Id]
      ,[PurchaseId]
      ,[PurchaseTypeId]
      ,[CreatedDate]
      ,[ModifiedDate]
      ,[BudgetFY]
      ,[Fund]
      ,[BudgetActivity]
      ,[DocSource]
      ,[DocumentNumber]
      ,[DocumentDate]
      ,[VendorName]
      ,[ObligatedAmt]
      ,[ReceivedAmt]
      ,[BilledAmt]
      ,[LastTransactionDate]
      ,[IsPriority]
  FROM [dbo].[LineItemsForReview]
  WHERE DocSource = 'Travel' 
and UserId is null

As you can see from the above query, we are trying to find accounting line items that have not been assigned for review. When we run this query on our production database (which is more complicated than the example I'm providing), the results return in approximately 3 minutes, 23 seconds. However, when we use the following query, we can get those same results back in 4 seconds.

SELECT [UserId]
      ,[UserFullName]
      ,[Id]
      ,[PurchaseId]
      ,[PurchaseTypeId]
      ,[CreatedDate]
      ,[ModifiedDate]
      ,[BudgetFY]
      ,[Fund]
      ,[BudgetActivity]
      ,[DocSource]
      ,[DocumentNumber]
      ,[DocumentDate]
      ,[VendorName]
      ,[ObligatedAmt]
      ,[ReceivedAmt]
      ,[BilledAmt]
      ,[LastTransactionDate]
      ,[IsPriority]
  FROM [dbo].[LineItemsForReview]
  WHERE DocSource = 'Travel' 
    and UserFullName is null

As you can see when looking at the select statement and the view definition, the change to the predicate now utilizes a concatenated string instead of a single field of int data type. This potential improvement is great news for both our technical team and our end users, but I'd like to understand why this particular change decreases the query time before releasing the change to production.

I have discussed this issue with our data architect and other developers and have not come up with an answer. I've also examined the execution plans on both queries – the largest difference between the two is that the slow query (that looks at the single int field) performs a sort on the column PurchaseId, whereas the query using the concatenated field does not perform that sort.

Question: why is the query faster when searching on a concatenated string field instead of a single field of int data type?

Tables and View from Example:

/****** Object:  Table [Auth].[AdminCodeAccess]    Script Date: 2/27/2017 12:29:51 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [Auth].[AdminCodeAccess](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [OrgRepId] [int] NULL,
    [UserRoleId] [int] NULL,
    [ActivationDate] [datetime] NULL,
    [DeactivationDate] [datetime] NULL,
    [IsPrimary] [bit] NULL,
PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

GO
/****** Object:  Table [Auth].[Role]    Script Date: 2/27/2017 12:29:51 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [Auth].[Role](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [RoleName] [varchar](50) NOT NULL,
    [ActivationDate] [datetime] NOT NULL,
    [DeactivationDate] [datetime] NULL,
PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

GO
SET ANSI_PADDING OFF
GO
/****** Object:  Table [Auth].[User]    Script Date: 2/27/2017 12:29:51 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [Auth].[User](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [FirstName] [varchar](50) NULL,
    [LastName] [varchar](50) NULL,
    [NetworkId] [varchar](50) NULL,
    [ExpirationDate] [datetime] NULL,
    [Email] [varchar](250) NULL,
    [NetworkOrg] [varchar](10) NULL,
PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

GO
SET ANSI_PADDING OFF
GO
/****** Object:  Table [Auth].[UserRoles]    Script Date: 2/27/2017 12:29:51 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [Auth].[UserRoles](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [UserId] [int] NOT NULL,
    [RoleId] [int] NOT NULL,
    [ActivationDate] [datetime] NOT NULL,
    [DeactivationDate] [datetime] NULL,
PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

GO
/****** Object:  Table [dbo].[AccountingLineItem]    Script Date: 2/27/2017 12:29:51 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[AccountingLineItem](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [PurchaseId] [int] NOT NULL,
    [PurchaseTypeId] [int] NULL,
    [CreatedDate] [datetime] NULL,
    [ModifiedDate] [datetime] NULL,
    [BudgetFY] [int] NULL,
    [AccountNumber] [varchar](10) NULL,
    [Fund] [varchar](30) NULL,
    [BudgetActivity] [varchar](15) NULL,
    [DocSource] [varchar](50) NULL,
    [DocumentNumber] [varchar](50) NULL,
    [DocumentDate] [datetime] NULL,
    [VendorName] [varchar](100) NULL,
    [OrgId] [int] NULL,
    [ObligatedAmt] [decimal](19, 4) NULL,
    [ReceivedAmt] [decimal](19, 4) NULL,
    [BilledAmt] [decimal](19, 4) NULL,
    [LastTransactionDate] [datetime] NULL,
    [IsPriority] [bit] NOT NULL,
PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

GO
SET ANSI_PADDING OFF
GO
/****** Object:  Table [dbo].[Assignment]    Script Date: 2/27/2017 12:29:51 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Assignment](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [LineItemId] [int] NOT NULL,
    [AssignedReviewerUserId] [int] NULL,
    [AssignorUserId] [int] NULL,
    [AssignmentDate] [datetime] NULL,
    [DeactivationDate] [datetime] NULL,
PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

GO
/****** Object:  View [dbo].[LineItemsForReview]    Script Date: 2/27/2017 12:29:51 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE VIEW [dbo].[LineItemsForReview]
AS
SELECT        reviewer.Id AS UserId, reviewer.FirstName + ' ' + reviewer.LastName AS UserFullName, ali.Id, ali.PurchaseId, ali.PurchaseTypeId, ali.CreatedDate, ali.ModifiedDate, 
                         ali.BudgetFY, ali.Fund, ali.BudgetActivity, ali.DocSource, ali.DocumentNumber, ali.DocumentDate, ali.VendorName, ali.ObligatedAmt, ali.ReceivedAmt, ali.BilledAmt, 
                         ali.LastTransactionDate, ali.IsPriority
FROM            dbo.AccountingLineItem AS ali LEFT OUTER JOIN
                         dbo.Assignment AS a ON ali.Id = a.LineItemId LEFT OUTER JOIN
                         Auth.[User] AS reviewer ON a.AssignedReviewerUserId = reviewer.Id LEFT OUTER JOIN
                         Auth.[User] AS assignor ON a.AssignorUserId = assignor.Id

GO
ALTER TABLE [dbo].[AccountingLineItem] ADD  DEFAULT ((0)) FOR [IsPriority]
GO
ALTER TABLE [Auth].[AdminCodeAccess]  WITH CHECK ADD  CONSTRAINT [FK_AdminCodeAccess_User] FOREIGN KEY([UserRoleId])
REFERENCES [Auth].[UserRoles] ([Id])
GO
ALTER TABLE [Auth].[AdminCodeAccess] CHECK CONSTRAINT [FK_AdminCodeAccess_User]
GO
ALTER TABLE [Auth].[UserRoles]  WITH CHECK ADD  CONSTRAINT [FK_UserRoles_Role] FOREIGN KEY([RoleId])
REFERENCES [Auth].[Role] ([Id])
GO
ALTER TABLE [Auth].[UserRoles] CHECK CONSTRAINT [FK_UserRoles_Role]
GO
ALTER TABLE [Auth].[UserRoles]  WITH CHECK ADD  CONSTRAINT [FK_UserRoles_User] FOREIGN KEY([UserId])
REFERENCES [Auth].[User] ([Id])
GO
ALTER TABLE [Auth].[UserRoles] CHECK CONSTRAINT [FK_UserRoles_User]
GO
ALTER TABLE [dbo].[Assignment]  WITH CHECK ADD  CONSTRAINT [FK_Assignment_AccountingLineItem] FOREIGN KEY([LineItemId])
REFERENCES [dbo].[AccountingLineItem] ([Id])
GO
ALTER TABLE [dbo].[Assignment] CHECK CONSTRAINT [FK_Assignment_AccountingLineItem]
GO
ALTER TABLE [dbo].[Assignment]  WITH CHECK ADD  CONSTRAINT [FK_Assignment_User] FOREIGN KEY([AssignedReviewerUserId])
REFERENCES [Auth].[User] ([Id])
GO
ALTER TABLE [dbo].[Assignment] CHECK CONSTRAINT [FK_Assignment_User]
GO
ALTER TABLE [dbo].[Assignment]  WITH CHECK ADD  CONSTRAINT [FK_Assignment_User_Id] FOREIGN KEY([AssignorUserId])
REFERENCES [Auth].[User] ([Id])
GO
ALTER TABLE [dbo].[Assignment] CHECK CONSTRAINT [FK_Assignment_User_Id]
GO

Edit: By request, I've added two query plans to pastetheplan.com.

Slow Query: https://www.brentozar.com/pastetheplan/?id=SJegN-z5e

Fast Query: https://www.brentozar.com/pastetheplan/?id=rkH5sbGqe

In addition, there are indices on columns commonly used in queries and by the engine, and we continue to add indices as needed; however, there are no indices on the tables included in the sample query, besides those created with the PKs. You should be able to see other indices in the execution plan.

Best Answer

Late to the party but...

I suspect that bad statistics caused the the slow query to not granted enough memory, so it had to write to the tempdb. Updating your statistics should resolve.

If you look at the query plans, you'll notice a couple of things. First, there is a warning, which if you hover over and scroll down, will see the following details:

enter image description here

"Warnings: Operator used tempdb to spill data during the execution with spill level 1"

Second, while you are hovering over the various nodes of your query plans, take note of the estimated rows versus actual (11247). You'll find that the slow plan generally has lower estimate (315), compared to the fast query (4200). To me, that underestimation likely caused a too small grant, which led it to need to fall back to the slower tempdb.

More details from someone more well versed in the minutia than I:

Correct SQL Server TempDB Spills in Query Plans Caused by Outdated Statistics