Sql-server – varchar and nvarchar in tuning a stored procedure – how to improve performance in this scenario

execution-planperformancequery-performancesql serversql-server-2005table variable

I have the following procedure that is called over a million times a day, and I think it can be tuned for better resources usage.

ALTER PROCEDURE [DenormV2].[udpProductTaxRateGet]
(
    @itemNo varchar ( 20 ),
    @calculateDate datetime,
    @addressLine1 nvarchar( 50 ),
    @addressLine2 nvarchar( 50 ),
    @addressLine3 nvarchar( 50 ),
    @addressLine4 nvarchar( 50 ),
    @addressLine5 nvarchar( 50 ),
    @addressLine6 nvarchar( 50 ),
    @postalCode nvarchar( 20 ),
    @countryCode varchar( 2 ),
    @addressFormatID int
)
WITH EXECUTE AS 'webUserWithRW'
AS
--see Bocss2.dbo.[fnGetProductTax] for equivalent logic and comments in Bocss
DECLARE @Addresses TABLE (TaxRegionId int NOT NULL)

INSERT INTO @Addresses(TaxRegionId)
SELECT  DISTINCT TaxRegionId
FROM    dbo.[ShipTaxAddress]
WHERE   [CountryCode] = @countryCode
AND     [AddressFormatID] = @addressFormatID
AND     ISNULL (CONVERT(nvarchar(50),[MatchAddressLine1]), ISNULL(@addressLine1, '')) = ISNULL(@addressLine1, '')
AND     ISNULL (CONVERT(nvarchar(50),[MatchAddressLine2]), ISNULL(@addressLine2, '')) = ISNULL(@addressLine2, '')
AND     ISNULL (CONVERT(nvarchar(50),[MatchAddressLine3]), ISNULL(@addressLine3, '')) = ISNULL(@addressLine3, '')
AND     ISNULL (CONVERT(nvarchar(50),[MatchAddressLine4]), ISNULL(@addressLine4, '')) = ISNULL(@addressLine4, '')
AND     ISNULL (CONVERT(nvarchar(50),[MatchAddressLine5]), ISNULL(@addressLine5, '')) = ISNULL(@addressLine5, '')
AND     ISNULL (CONVERT(nvarchar(50),[MatchAddressLine6]), ISNULL(@addressLine6, '')) = ISNULL(@addressLine6, '')
AND     @postalcode Like ISNULL ( CONVERT(nvarchar(20),[MatchPostalCode]), @postalcode)




SELECT DISTINCT ISNULL(pst.TaxCode, '') as TaxCode
     , ISNULL(pst.TaxRate, 0) as TaxRate
FROM    dbo.[ProductShipTax] pst
        INNER JOIN
        @Addresses a
            ON pst.TaxRegionId = a.TaxRegionId
WHERE   pst.[ItemNo] = @itemNo
AND     @calculateDate BETWEEN pst.[DateFrom] AND pst.[DateTo]

GO

this procedure inserts values into a table variable from the following table:
please note that the columns in the table are VARCHAR and for some reason the parameters of the stored procedure and inside the code as well, everything is converted to NVARCHAR.

IF OBJECT_ID('[dbo].[ShipTaxAddress]') IS NOT NULL 
DROP TABLE [dbo].[ShipTaxAddress] 
GO
CREATE TABLE [dbo].[ShipTaxAddress] ( 
[TaxRegionAddressId]  INT                              NOT NULL,
[TaxRegionId]         INT                              NOT NULL,
[CountryCode]         VARCHAR(2)                       NOT NULL,
[AddressFormatId]     INT                              NOT NULL,
[MatchAddressLine1]   VARCHAR(50)                          NULL,
[MatchAddressLine2]   VARCHAR(50)                          NULL,
[MatchAddressLine3]   VARCHAR(50)                          NULL,
[MatchAddressLine4]   VARCHAR(50)                          NULL,
[MatchAddressLine5]   VARCHAR(50)                          NULL,
[MatchAddressLine6]   VARCHAR(50)                          NULL,
[MatchPostalCode]     VARCHAR(20)                          NULL,
CONSTRAINT   [PK_ShipTaxAddress]  PRIMARY KEY CLUSTERED    ([TaxRegionAddressId] asc))

Please note that this table [dbo].[ShipTaxAddress] has less than 200 rows.

this is the other table:

sp_gettabledef 'dbo.ProductShipTax' — this is what I use to get the table definition. if you are interested the code is here for share.

IF OBJECT_ID('[dbo].[ProductShipTax]') IS NOT NULL 
DROP TABLE [dbo].[ProductShipTax] 
GO
CREATE TABLE [dbo].[ProductShipTax] ( 
[ProductShipTaxID]  INT              IDENTITY(1,1)   NOT NULL,
[DateFrom]          SMALLDATETIME                    NOT NULL,
[DateTo]            SMALLDATETIME                    NOT NULL,
[TaxRate]           DECIMAL(18,4)                    NOT NULL,
[ItemNo]            VARCHAR(20)                      NOT NULL,
[TaxCode]           VARCHAR(20)                          NULL,
[TaxRegionId]       INT                              NOT NULL,
CONSTRAINT   [PK_ProductShipTax]  PRIMARY KEY CLUSTERED    ([ProductShipTaxID] asc))

GO

CREATE NONCLUSTERED INDEX [IX_ProductShipTax_ITemNo_DateFrom_DateTo] 
   ON [dbo].[ProductShipTax] ([ItemNo] asc, [DateFrom] asc, [DateTo] asc)

CREATE NONCLUSTERED INDEX [idx_ProductShipTax__K7_K5_K2_K3_K1_K4_6_INCL] 
   ON [dbo].[ProductShipTax] ([TaxRegionId] asc, [ItemNo] asc, [DateFrom] asc, [DateTo] asc, [ProductShipTaxID] asc, [TaxRate] asc)
   INCLUDE ([TaxCode])

this is the XML execution plan of this procedure run according to the statement below:

    exec dbo.udpProductTaxRateGet 
@itemNo=N'35638956',
@calculateDate='Aug  8 2016  1:01:46:760PM',
@addressLine1=N'',
@addressLine2=N'',
@addressLine3=N'114 FORGE LN',
@addressLine4=N'',
@addressLine5=N'FEASTERVILLE TREVOSE',
@addressLine6=N'PA',
@postalcode=N'190537838',
@countryCode=N'US',
@addressFormatID=2

where do I start?

This is how I improved this procedure:

I have created the following indexes:

    CREATE INDEX IDX_ShipTaxAddress_ShipTaxAddress
    ON dbo.[ShipTaxAddress] (CountryCode,
                             AddressFormatID,
                             MatchPostalCode)
    INCLUDE (TaxRegionId,
             [MatchAddressLine1],
             [MatchAddressLine2],
             [MatchAddressLine3],
             [MatchAddressLine4],
             [MatchAddressLine5],
             [MatchAddressLine6])
    GO

     CREATE NONCLUSTERED INDEX IX_ProductShipTax_ITemNo_DateFrom_DateTo 
     ON [dbo].[ProductShipTax] (  [ItemNo] ASC  
                                , [DateFrom] ASC  
                                , [DateTo] ASC  )   
    INCLUDE (TaxRegionId ,TaxCode,TaxRate)
    WITH (DROP_EXISTING=ON)

I have changed the relevant columns in the table, from VARCHAR to NVARCHAR, in order to eliminate the need of conversion.
the Table became like this:

IF OBJECT_ID('[dbo].[ShipTaxAddress]') IS NOT NULL 
DROP TABLE [dbo].[ShipTaxAddress] 
GO
CREATE TABLE [dbo].[ShipTaxAddress] ( 
[TaxRegionAddressId]  INT                              NOT NULL,
[TaxRegionId]         INT                              NOT NULL,
[CountryCode]         VARCHAR(2)                       NOT NULL,
[AddressFormatId]     INT                              NOT NULL,
[MatchAddressLine1]   NVARCHAR(50)                         NULL,
[MatchAddressLine2]   NVARCHAR(50)                         NULL,
[MatchAddressLine3]   NVARCHAR(50)                         NULL,
[MatchAddressLine4]   NVARCHAR(50)                         NULL,
[MatchAddressLine5]   NVARCHAR(50)                         NULL,
[MatchAddressLine6]   NVARCHAR(50)                         NULL,
[MatchPostalCode]     NVARCHAR(20)                         NULL,
CONSTRAINT   [PK_ShipTaxAddress]  
PRIMARY KEY NONCLUSTERED ([TaxRegionAddressId] asc))
GO

I have changed the procedure:

ALTER PROCEDURE [DenormV2].[udpProductTaxRateGet]
(
    @itemNo varchar ( 20 ),
    @calculateDate datetime,
    @addressLine1 nvarchar( 50 ),
    @addressLine2 nvarchar( 50 ),
    @addressLine3 nvarchar( 50 ),
    @addressLine4 nvarchar( 50 ),
    @addressLine5 nvarchar( 50 ),
    @addressLine6 nvarchar( 50 ),
    @postalCode nvarchar( 20 ),
    @countryCode varchar( 2 ),
    @addressFormatID int
)
WITH EXECUTE AS 'webUserWithRW'
AS
--see Bocss2.dbo.[fnGetProductTax] for equivalent logic and comments in Bocss


SELECT @postalcode    = CASE WHEN @postalcode   = N'' THEN NULL ELSE @postalcode   END
SELECT @addressLine1  = CASE WHEN @addressLine1 = N'' THEN NULL ELSE @addressLine1 END
SELECT @addressLine2  = CASE WHEN @addressLine2 = N'' THEN NULL ELSE @addressLine2 END
SELECT @addressLine3  = CASE WHEN @addressLine3 = N'' THEN NULL ELSE @addressLine3 END
SELECT @addressLine4  = CASE WHEN @addressLine4 = N'' THEN NULL ELSE @addressLine4 END
SELECT @addressLine5  = CASE WHEN @addressLine5 = N'' THEN NULL ELSE @addressLine5 END
SELECT @addressLine6  = CASE WHEN @addressLine6 = N'' THEN NULL ELSE @addressLine6 END



SELECT TOP 1  ISNULL(pst.TaxCode, '') as TaxCode
            , ISNULL(pst.TaxRate, 0) as TaxRate
FROM    dbo.[ProductShipTax] pst
WHERE EXISTS (

    SELECT  TaxRegionId
    FROM    dbo.[ShipTaxAddress]
    WHERE   [CountryCode] = @countryCode
    AND     [AddressFormatID] = @addressFormatID

    AND ([MatchAddressLine1] = @AddressLine1 OR ([MatchAddressLine1] IS NULL AND @AddressLine1 IS NULL) )
    AND ([MatchAddressLine2] = @AddressLine2 OR ([MatchAddressLine2] IS NULL AND @AddressLine2 IS NULL) )
    AND ([MatchAddressLine3] = @AddressLine3 OR ([MatchAddressLine3] IS NULL AND @AddressLine3 IS NULL) )
    AND ([MatchAddressLine4] = @AddressLine4 OR ([MatchAddressLine4] IS NULL AND @AddressLine4 IS NULL) )
    AND ([MatchAddressLine5] = @AddressLine5 OR ([MatchAddressLine5] IS NULL AND @AddressLine5 IS NULL) )
    AND ([MatchAddressLine6] = @AddressLine6 OR ([MatchAddressLine6] IS NULL AND @AddressLine6 IS NULL) )

    AND (@postalcode = [MatchPostalCode]     OR ([MatchPostalCode]   IS NULL AND @postalcode   IS NULL) )
    AND  TaxRegionId = pst.TaxRegionId     

)
AND pst.[ItemNo] = @itemNo
AND @calculateDate BETWEEN pst.[DateFrom] AND pst.[DateTo]

GO

when comparing the following:

USE US16HSMMProduct_ORIGINAL
GO
exec dbo.udpProductTaxRateGet 
        @itemNo=N'31997299',
        @calculateDate='Aug  8 2016  1:01:46:760PM',
        @addressLine1=N'',
        @addressLine2=N'',
        @addressLine3=N'',
        @addressLine4=N'',
        @addressLine5=N'',
        @addressLine6=N'FL',
        @postalcode=N'',
        @countryCode=N'US',
        @addressFormatID=2
go
USE US16HSMMProduct_AFTER_CHANGES
GO
exec DenormV2.udpProductTaxRateGet
        @itemNo=N'31997299',
        @calculateDate='Aug  8 2016  1:01:46:760PM',
        @addressLine1=N'',
        @addressLine2=N'',
        @addressLine3=N'',
        @addressLine4=N'',
        @addressLine5=N'',
        @addressLine6=N'FL',
        @postalcode=N'',
        @countryCode=N'US',
        @addressFormatID=2
go

we get this:
enter image description here

here is the execution plan of the old procedure

here is the execution plan of the new procedure

Best Answer

  1. get rid of the conversions to nvarchar. Your tables are using varchar, change your parameters to be varchar also.
  2. Get rid of the logic like this:

    AND     ISNULL (CONVERT(nvarchar(50),[MatchAddressLine1]), ISNULL(@addressLine1, '')) = ISNULL(@addressLine1, '')
    

    SQL doesn't work as well when you use functions on the columns in the where clause. Instead do this (and remember we are getting rid of the nvarchar parameters.

    AND ([MatchAddressLine1] = @AddressLine1
         OR ([MatchAddressLine1] IS NULL and @AddressLine1 IS NULL) )
    

    Note your current logic would return a row where one is NULL and the other is ''. If you still need that logic you'll have to add two more OR options but it will still work. The optimizer can work much better with this type of logic.

  3. Also you might change from a table variable to a temp table. There is a significant difference that you can see here.

  4. Last option is you could get rid of the temp table/variable entirly and use a CTE.

    WITH Addresses AS (
        SELECT  DISTINCT TaxRegionId
        FROM    dbo.[ShipTaxAddress]
        WHERE   [CountryCode] = @countryCode
        AND     [AddressFormatID] = @addressFormatID
        AND     ([MatchAddressLine1] = @addressLine1
            OR ([MatchAddressLine1] IS NULL and @AddressLine1 IS NULL) )
        AND     ([MatchAddressLine2] = @addressLine2
            OR ([MatchAddressLine2] IS NULL and @AddressLine2 IS NULL) )
        AND     ([MatchAddressLine3] = @addressLine3
            OR ([MatchAddressLine3] IS NULL and @AddressLine3 IS NULL) )
        AND     ([MatchAddressLine4] = @addressLine4
            OR ([MatchAddressLine4] IS NULL and @AddressLine4 IS NULL) )
        AND     ([MatchAddressLine5] = @addressLine5
            OR ([MatchAddressLine5] IS NULL and @AddressLine5 IS NULL) )
        AND     ([MatchAddressLine6] = @addressLine6
            OR ([MatchAddressLine6] IS NULL and @AddressLine6 IS NULL) )
        AND    (@postalcode IS NULL OR [MatchPostalCode] = @postalcode)
    )
    SELECT DISTINCT ISNULL(pst.TaxCode, '') as TaxCode
         , ISNULL(pst.TaxRate, 0) as TaxRate
    FROM    dbo.[ProductShipTax] pst
            INNER JOIN Addresses a
                ON pst.TaxRegionId = a.TaxRegionId
    WHERE   pst.[ItemNo] = @itemNo
    AND     @calculateDate BETWEEN pst.[DateFrom] AND pst.[DateTo]
    

You should check my code and make sure the logic is going to be correct but I believe it will be. Also use something like SET STATISTICS IO ON to get a good time (in milliseconds) of how long it runs before and after. Stranger things have happened than your code would be faster than mine.