SQL Server – Query Index Optimization

index-tuningsql serversql-server-2016

I am trying to figure out how I can index this query below. The table is 155 columns. It has a clustered index on the column which is not in the query. The query is dynamically created from out of the box application and no way I cant tune it except adding index. I could not find an index that will optimize the query. The table has 511919 rows.Now, it is clustered index scan OR non clustered index Scan (with covered index)

SQL Server version : 2016 SP2, We have 22 core and 24 gbs on the server. I can post a query plan.

    Declare
        @P0 VARCHAR(8000)
        ,@P1 VARCHAR(8000)
        ,@P2 VARCHAR(8000)
        ,@P3 VARCHAR(8000)
        ,@P4 INT
        ,@P5 VARCHAR(8000)
        ,@P6 VARCHAR(8000)
        ,@P7 VARCHAR(8000)
        ,@P8 INT
        ,@P9 VARCHAR(8000)

Set @P0 = 'Cancelled'
Set @P1 ='20190321'
Set @P2 ='20190321'
Set @P3 ='20190321'
Set @P5 ='20190321'
Set @P6 ='20190321'
Set @P7 ='20190321'

Set @P4 =1
Set @P8 =0

Set @P9='Service'

SELECT  Col1,
    Col2,
    Col3,
    Col4,
    Col5,
    Col6,
    Col7,
    Col8,
    Col9,
    Col10,
    Col11,
    Col12,
    Col13,
    Col14,
    Col15,
    Col16,
    Col17,
    Col18,
    Col19,
    Col20,
    Col21,
    Col22,
    Col23,
    Col24,
    Col25,
    Col26,
    Col27,
    Col28,
    Col29,
    Col30,
    Col31,
    Col32,
    Col33,
    Col34,
    Col35,
    Col36,
    Col37
FROM Table1
WHERE (
        Col18 IS NOT NULL
        AND Col75 <> @P0
        AND DATEDIFF(dd, Col146, CAST(SUBSTRING(@P1, 1, 4) + '-' + SUBSTRING(@P2, 5, 2) + '-' + SUBSTRING(@P3, 7, 2) + 'T00:00:00.000' AS DATETIME)) <= @P4
        AND DATEDIFF(dd, Col146, CAST(SUBSTRING(@P5, 1, 4) + '-' + SUBSTRING(@P6, 5, 2) + '-' + SUBSTRING(@P7, 7, 2) + 'T00:00:00.000' AS DATETIME)) >= @P8
        )
    AND Col39 = @P9
ORDER BY 18 ASC
    ,31 DESC
    ,30 ASC

This is a covered index. It is used as non clustered index Scan.

/****** Object:  Index [NC_OBJ_DID_SQ_CoveredIndex2]    Script Date: 3/21/2019 1:30:31 PM ******/
CREATE NONCLUSTERED INDEX NC1 ON Table1
(
    TableCol39  ASC,
    TableCol6 ASC,
    TableCol75  ASC,
    TableCol18  ASC
)
INCLUDE (   TableCol1,
    TableCol2,
    TableCol3,
    TableCol4,
    TableCol5,
    TableCol6,
    TableCol7,
    TableCol8,
    TableCol9,
    TableCol10,
    TableCol11,
    TableCol12,
    TableCol13,
    TableCol14,
    TableCol15,
    TableCol17,
    TableCol19,
    TableCol20,
    TableCol21,
    TableCol22,
    TableCol23,
    TableCol24,
    TableCol25,
    TableCol26,
    TableCol27,
    TableCol28,
    TableCol29,
    TableCol30,
    TableCol31,
    TableCol32,
    TableCol33,
    TableCol34,
    TableCol35,
    TableCol36,
    TableCol37,
    TablCol146 ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

Thanks a bunch!

Best Answer

If you are unable to adapt the queries, some of these columns in the where clause will remain not sargable.

To make your query sargable, You would need to try and remove the DATEADD() functions in the where clause, around Col146. This makes the (possible) index on Col146 not sargable.

Consider this as a possible change:

AND Col146 <= DATEADD(dd,-@P4,CAST(SUBSTRING(@P1, 1, 4) + '-' + SUBSTRING(@P2, 5, 2) + '-' + SUBSTRING(@P3, 7, 2) + 'T00:00:00.000' AS DATETIME)) 
AND Col146 >= DATEADD(dd,-@P8,CAST(SUBSTRING(@P1, 1, 4) + '-' + SUBSTRING(@P2, 5, 2) + '-' + SUBSTRING(@P3, 7, 2) + 'T00:00:00.000' AS DATETIME)) 

Making sure that all functions are applied to the variables. You should not take my word for it and double check this.

Index changes.

If you could make your query sargable & the sort is not expensive.

CREATE INDEX IX_Table1_Col39_Col146_Col75
ON Table1(Col39,Col146,Col75)
INCLUDE(Col18 
,Col31 
,Col30 )
WHERE Col18  IS NOT NULL

However, If a lot of rows are returned after the search predicates, you might have to index your SORT (Order by) As well, since that one will use a lot of CPU depending on the amount of rows passing through. We would have to see an actual query plan to determine this.


We are not able to change the query, no matter how much we beg and plead for mercy

You could try and index that what is sargable and try to remove the SORT operator.

CREATE INDEX IX_Table1_Col39_Col75_Col18_Col31_Col30
ON Table1(Col39 ASC ,Col75 ASC ,Col18 ASC,Col31 DESC, Col30 ASC)
INCLUDE(  Col146 )
WHERE Col18  IS NOT NULL;

Note that the <> operator is not that sargable, you could probably try and change it to;

CREATE INDEX IX_Table1_Col39_Col146_Col75
ON Table1(Col39 ASC ,Col18 ASC,Col31 DESC, Col30 ASC)
INCLUDE(  Col146,Col75 )
WHERE Col18  IS NOT NULL;

What about the other columns?

The include list is something I will leave to you, depending on the expected resultset's, the key lookup could be something to live with or not.

The optimizer might not even use the index if it thinks that the key lookup would be too expensive.


Testing the theories

To see if what I am saying holds any truth, I conducted a test, ofcourse, YMMV. It is simply to see if it is even possible to remove the SORT operator and that we are able to seek on certain filters.

DDL + DML Below.

Index #1

CREATE INDEX IX_Table1_Col39_Col75_Col18_Col31_Col30
ON Table1(Col39 ASC ,Col75 ASC ,Col18 ASC,Col31 DESC, Col30 ASC)
INCLUDE(  Col146 )
WHERE Col18  IS NOT NULL;

While we are able to seek on col75 and col39, the sort + residual predicate remains:

enter image description here

And that is not what was expected unfortunately.

There is light at the end of the Tunnel (Index #2)

CREATE INDEX IX_Table1_Col39_Col18_Col31_Col30
ON Table1(Col39 ASC  ,Col18 ASC,Col31 DESC, Col30 ASC)
INCLUDE(  Col146,Col75)
WHERE Col18  IS NOT NULL;

I can rerun my modified test query (without all the selected columns)

         Declare
        @P0 VARCHAR(8000)
        ,@P1 VARCHAR(8000)
        ,@P2 VARCHAR(8000)
        ,@P3 VARCHAR(8000)
        ,@P4 INT
        ,@P5 VARCHAR(8000)
        ,@P6 VARCHAR(8000)
        ,@P7 VARCHAR(8000)
        ,@P8 INT
        ,@P9 VARCHAR(8000)


Set @P0 = '1blabla'
Set @P1 ='20190315'
Set @P2 ='20190315'
Set @P3 ='20190315'
Set @P5 ='20190315'
Set @P6 ='20190315'
Set @P7 ='20190315'

SET @P9 = '0bla'
Set @P4 =1
Set @P8 =1
SELECT *
FROM table1 Where
(
 Col18 IS NOT NULL
        AND Col75 <> @P0
           AND DATEDIFF(dd, Col146, CAST(SUBSTRING(@P1, 1, 4) + '-' + SUBSTRING(@P2, 5, 2) + '-' + SUBSTRING(@P3, 7, 2) + 'T00:00:00.000' AS DATETIME)) <= @P4
        AND DATEDIFF(dd, Col146, CAST(SUBSTRING(@P5, 1, 4) + '-' + SUBSTRING(@P6, 5, 2) + '-' + SUBSTRING(@P7, 7, 2) + 'T00:00:00.000' AS DATETIME)) >= @P8

     )
    AND Col39 = @P9
ORDER BY Col18 ASC
    ,Col31 DESC
    ,Col30 ASC;

To get a pretty seek with no sort or other operators (And a pesky residual predicate.)

enter image description here

Your result could look more like this:

enter image description here

Conclusion

Based on these tests, which may very well be 100% different for you, this is the index that I would initially go for:

CREATE INDEX IX_Table1_Col39_Col18_Col31_Col30
ON Table1(Col39 ASC  ,Col18 ASC,Col31 DESC, Col30 ASC)
INCLUDE(  Col146,Col75)
WHERE Col18  IS NOT NULL;

DDL + DML Used in this example

        SET NOCOUNT ON;
        DECLARE @i int = 0
        WHILE @i < 100
        BEGIN
        INSERT INTO Table1(Col39,   Col75,Col146,Col18,Col31,Col30)
        VALUES(cast(@i as varchar(3))+'bla',
               cast(@i as varchar(3))+'blabla',
               DATEADD(DD,-@i,GETDATE()),
               cast(@i as varchar(3))+'blablabla',
               cast(@i as varchar(3))+'blablablabla',
               cast(@i as varchar(3))+'blablablablabla');
        SET @i += 1;
        END
        SET NOCOUNT OFF ;
        INSERT INTO Table1
        SELECT * From Table1;
        GO 7
        Beginning execution loop

   (6500 rows)

    GO