Sql-server – Advice on how to improve the efficiency of a stored procedure using lots of joins

performancequery-performancesql serversql-server-2012stored-procedures

I have a stored procedure that is returning search results for items in a database grouped by customers who own them.

The user performing the search is restricted as to what items they can see by the category permissions and entity permissions.

The user category permissions are passed in on the @CategoryPermissions parameter and are filtered against the tariff the customer is on.

Items are associated with entities (via a number of combinations of BusinessID and PIN, SubID, AccountNumber or GroupSubID)

Users have to attest to these entities and to be able to see the items. What they have attested to is passed into the stored procedure as the @EntityPermissions.

The results will also indicate if the user has already viewed the item before via the ReadStatus table

The stored procedure also has the option filter by 'read' flag which can be in 3 states

  • return all items
  • only unread items or
  • only read items.

The read status table will only contain a row where the item has been read, or 'marked as unread' so I am using an EXISTS and NOT EXISTS to filter by the @Read flag.

We will only return the top number of rows specified in the @RecordLimit parameter, but we need to return the total number of items found. This is being done by a COUNT(*) OVER() as ResultCount, and by loading the results into a temporary table we can select that value and output it as 'TotalCount'

Is there any way to alter this stored procedure to be more efficient? If I could avoid using a temporary table, or the EXIST and NOT EXISTS extra query on the read status table it would improve the execution time. Any advice would be welcomed!

Here is the stored procedure source:

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE PROCEDURE [dbo].[uspMetaDataSearchItems]
(
    @TariffNames varchar(8000),
    @Category varchar(8000),
    @SearchText varchar(50) = null,
    @SupervisorName varchar(50) = null,
    @StartDate datetime = null,
    @EndDate datetime = null,
    @SortByColumn varchar(50) = null,
    @RecordLimit int,
    @StatusCheck bit,
    @ExpirationCheck bit,
    @CategoryPermissions CategoryPermissionTableType READONLY,
    @ApplyCategoryPermissions BIT = 0,
    @EntityPermissions EntityPermissionTableType READONLY,
    @ApplyEntityPermissions BIT = 0,
    @ExpirationStartDate datetime = null,
    @ExpirationEndDate datetime = null,
    @Read BIT = null,
    @UserID INT = null
)
AS
BEGIN

    SET NOCOUNT ON

    DECLARE @tblTariffs AS TABLE(TariffName varchar(50))
    IF (@TariffNames IS NOT NULL)
        INSERT INTO @tblTariffs
        SELECT split.a.value('.', 'VARCHAR(100)') 
        FROM   (SELECT Cast ('<M>' + Replace(@TariffNames, ',', '</M><M>') +             '</M>' 
            AS XML 
            ) AS String 
        ) AS A 
        CROSS apply string.nodes ('/M') AS Split(a)

    DECLARE @tblCategory AS TABLE(Category varchar(50))
    IF (@Category IS NOT NULL)
        INSERT INTO @tblCategory
        SELECT split.a.value('.', 'VARCHAR(100)') 
        FROM   (SELECT Cast ('<M>' + Replace(@Category, ',', '</M><M>') +'</M>' 
        AS XML 
                ) AS String 
        ) AS A 
    CROSS apply string.nodes ('/M') AS Split(a)             

    SELECT @StartDate = ISNULL(@StartDate,CONVERT(datetime,'1970-01-01')),
        @EndDate = ISNULL(@EndDate,CONVERT(datetime,'2100-01-01'))

    IF @SortByColumn IS NULL OR @SortByColumn NOT IN ('last_item_received', 'tariff', 'customer_last_name')
        SET @SortByColumn = 'customer_last_name'

SELECT 'CustomerRecords' As "RecordSetName"

DECLARE @tmp table(member_id varchar(50), Customer_first_name varchar(50),Customer_last_name varchar(50),date_of_birth varchar(50),tariff varchar(50),customer_supervisors_name varchar(50), number_of_Items int, read_item_count int, last_item_received datetime, ResultCount INT);

INSERT INTO @tmp
SELECT TOP (@RecordLimit) 
    md.[CustomerMemberID] AS member_id,
    md.[CustomerFirstName] AS Customer_first_name,
    md.[CustomerLastName] AS Customer_last_name,
    md.CustomerDateOfBirth AS date_of_birth,
    md.[TariffName] AS tariff,
    MIN(ISNULL(SupervisorName,'')) as 'customer_supervisors_name',
    Count(DISTINCT md.ItemID) as 'number_of_items',
    -- Read Item Count is only available if a UserID is provided
    ISNULL(COUNT(DISTINCT (CASE WHEN rs.ReadIND <> 0 THEN md.ItemID END)), 0) as 'read_item_count',
    Max(md.CreatedDTM) as 'last_item_received',
    COUNT(*) OVER() as ResultCount
FROM [dbo].[tblMetaData] AS md WITH (NOLOCK) 
--Join To Get Read Item Count
LEFT OUTER JOIN [dbo].[tblReadStatus] as [rs] WITH (NOLOCK) on md.ItemID = rs.ItemID and rs.UserID = @UserID 
-- Apply permissions
LEFT JOIN @CategoryPermissions cp ON md.TariffName = cp.TariffName AND md.ItemCategory = cp.ItemCategory
LEFT JOIN [dbo].[tblEntityPermissionLink] as pl WITH (NOLOCK) on md.ItemID = pl.ItemID
LEFT JOIN @EntityPermissions pp ON md.TariffName = pp.TariffName AND pl.BusinessId = pp.BusinessId AND (pl.PIN = pp.PIN OR pl.SubID = pp.SubID OR pl.AccountNumber = pp.AccountNumber OR pl.GroupSubID = pp.GroupSubID)
-- Other filters
WHERE [md].[CreatedDTM] BETWEEN @StartDate AND @EndDate
AND [ItemExpiryDTM] BETWEEN ISNULL(@ExpirationStartDate,CONVERT(datetime,'1970-01-01')) AND ISNULL(@ExpirationEndDate,CONVERT(datetime,'2100-01-01'))
AND md.[ItemType] = 'Customer Item'

-- Apply read filter for user - Only filter if a UserID is supplied. @Read IS NULL (Return all items), @Read = 0 (Return only unread items), @Read = 1 (Return only read items)
AND (@UserID IS NULL OR (
        @Read IS NULL OR (
            (@Read=1 AND EXISTS (SELECT [rs].[ItemID] FROM [dbo].[tblReadStatus] AS [rs] WITH (NOLOCK) WHERE [rs].[ReadIND]=1 AND [rs].[ItemID] = [md].[ItemID] AND [rs].[UserID] = @UserID))
            OR
            (@Read=0 AND NOT EXISTS (SELECT [rs].[ItemID] FROM [dbo].[tblReadStatus] AS [rs] WITH (NOLOCK) WHERE [rs].[ReadIND]=1 AND [rs].[ItemID] = [md].[ItemID] AND [rs].[UserID] = @UserID))
        )
        ))
    AND ((@SupervisorName IS NULL) OR md.SupervisorName = @SupervisorName)
    AND ((@SearchText IS NULL) OR (md.CustomerLastName LIKE @SearchText + '%'))
    AND ((@ExpirationCheck IS NULL) OR (md.[ItemExpiryDTM] > GetDate()))
    AND ((@Category IS NULL) OR (md.ItemCategory IN (SELECT Category FROM @tblCategory)))
    AND ((@TariffNames IS NULL) OR (md.TariffName IN (SELECT TariffName FROM @tblTariffs)))

    -- Permissions
    AND (@ApplyCategoryPermissions = 0 OR cp.ItemCategory IS NOT NULL)
    AND (@ApplyEntityPermissions = 0 OR (pp.GroupSubID IS NOT NULL OR pp.AccountNumber IS NOT NULL OR pp.PIN IS NOT NULL OR pp.SubID IS NOT NULL))
    GROUP BY md.CustomerMemberID,md.CustomerFirstName,md.CustomerLastName,md.TariffName,md.CustomerDateOfBirth
    ORDER BY
        (CASE ISNULL(@SortByColumn,'')
            WHEN 'last_item_received' THEN Max(md.CreatedDTM)
        END) DESC,
        (CASE ISNULL(@SortByColumn,'')
            WHEN 'tariff' THEN md.[TariffName]
            WHEN 'customer_last_name' THEN md.[CustomerLastName]
        END) ASC

    -- Return the top results data set
    SELECT * FROM @tmp 

    -- Return the total row count
    IF EXISTS (SELECT top 1 ResultCount from @tmp) 
        SELECT TOP 1 ResultCount AS 'TotalCount' FROM @tmp   
    ELSE
        SELECT 0 AS 'TotalCount'

    SET NOCOUNT OFF

    SELECT '' As "RecordSetName"

END

GO

Best Answer

This is a "kitchen sink" query, for which SQL Server MVP Aaron Bertrand has a good video on how to optimize using dynamic SQL and a detailed post here.

A few points to get you started on the performance of your query:

  • Use dynamic SQL to simplify the @Read criteria and subsequent lookup in tblReadStatus. I'm guessing that this is going to be your main performance gain.
  • tblMetaData would benefit from having an index on (ItemType, CreatedDTM, ItemID) INCLUDE (TariffName, ItemCategory, SupervisorName, CustomerLastName, ItemExpiryDTM, CustomerMemberID, CustomerFirstName, CustomerLastName, CustomerDateOfBirth). If this table is large, a good index makes a huge performance difference. A clustered index (as opposed to non-clustered), obviously, won't need INCLUDE columns, just the indexed ones.
  • If you haven't done so already, put clustered indexes on the key column(s) of your table-type variables (you can define PRIMARY KEY CLUSTERED when you're creating the table type).
  • Unique index on tblEntityPermissionLink (ItemID) INCLUDE (BusinessID, PIN, SubID, AccountNumber, GroupSubID)
  • Unique index on tblReadStatus (UserID, ItemID) INCLUDE (ReadIND)
  • Really consider if those WITH (NOLOCK) are any good to you. This is not neccessarily performance-related, rather a best-practices note because they can cause you a lot of problems with regards to dirty reads.