Sql-server – Slow Performance caused by Correlated Sub-queries & Clustered Index Seek

optimizationperformancesql server

I'm running a query in which one of the tables (FIELDDATA) has nearly 400,000 records which is taking more than 2 minutes to execute. I ran the execution plan, and noticed that the clustered seek index is hogging nearly all the time. I also ran the Database Engine Tuning Advisor to see if I could improve the performance in any way. However, it only suggested a 3% improvement. Neither does the SQL management windows throws a 'missing index' suggestion. How do I optimize this query? I understand this is because of the correlated subqueries. Any pointers on how should I re-write the query would help immensely.

Clustered Index on FIELDDATA(on keys ID, FIELDID, tableRow and TableCol).
Non-Clustered Indices on FIELDDATA are (all non-unique) (a. FIELDID, b. NVALUE, c. SVALUE) .

SELECT E.EVENTID "EVENTID",E.STEPCOUNT "STEPCOUNT",E.STATUSMODIFIED "STATUSMODIFIED",
    F.FORMID "FORMID",E.SUBJECT "SUBJECT",E.COPY "COPY",T.TRACKERID "DFS__TrackerID",
    CASE WHEN E.EVENTTYPE=5 OR F.FORMTYPE='fp' THEN 0 ELSE 1 END "ALLOWOPEN",

    (SELECT SVALUE FROM FIELDDATA WHERE ID=E.EVENTID AND FIELDID='AdminEmail') "ADMINEMAIL",
    (SELECT SVALUE FROM FIELDDATA WHERE ID=E.EVENTID AND FIELDID='ProcessID') "PROCESSID",
    (SELECT SVALUE FROM FIELDDATA WHERE ID=E.EVENTID AND FIELDID='NotificationType') "NOTIFICATIONTYPE",
    (SELECT SVALUE FROM FIELDDATA WHERE ID=E.EVENTID AND FIELDID='NotificationID') "NOTIFICATIONID",
    (SELECT SVALUE FROM FIELDDATA WHERE ID=E.EVENTID AND FIELDID='SCUserID') "SCUSERID",
    (SELECT SVALUE FROM FIELDDATA WHERE ID=E.EVENTID AND FIELDID='DueDate') "DUEDATE" 

FROM  EVENTS    E WITH (NOLOCK), 
    (SELECT EVENTID, EVENTDEFID, STATUS FROM EVENTS WITH (NOLOCK) WHERE EVENTTYPE<>6) E1, 
     FORMS     F1 WITH (NOLOCK), FORMS     F WITH (NOLOCK), ROUTEDEF  D WITH (NOLOCK), 
     TRACKER   T WITH (NOLOCK), CONTAINERS C WITH (NOLOCK)  

WHERE (T.SENDER=@P0 OR (E.OWNER=LOWER(@P1) AND 

    (EXISTS (SELECT ID FROM FIELDDATA WHERE E.EVENTID=ID AND ((TABLECOL='isCC' OR FIELDID='isCC') AND SVALUE = 'true')))))
     AND (E.EVENTTYPE=1 OR E.EVENTTYPE=4 OR E.EVENTTYPE=6 OR E.EVENTTYPE=5) 
     AND ((T.ACTION>=1 AND T.ACTION<7) OR T.ACTION=10 OR T.ACTION=11 OR T.ACTION=4000 OR T.ACTION=4001 OR T.ACTION=4003 OR T.ACTION=4004) 
     AND E.EVENTDEFID=F.FORMREV AND E1.EVENTID=E.EVENTID AND E1.EVENTDEFID=F1.FORMREV AND D.FORMGUID=F1.FORMGUID AND T.TASKID=E.TASKID AND C.CONTAINERID=E.LOCATION AND 

    ((EXISTS (SELECT ID FROM FIELDDATA WHERE E.EVENTID=ID AND ((TABLECOL='NotificationID' OR FIELDID='NotificationID') AND LEN(SVALUE) > 0)))) 
    AND ((DATEADD(day,-@P2,GETDATE()) < E.STATUSMODIFIED AND E1.STATUS<>1) OR (E1.STATUS=1)) ORDER BY DUEDATE DESC                        

Execution Plan :: https://drive.google.com/file/d/0B0p8mDJ5O2ztcGtjeWF6dUJTNFk/view?usp=sharing

EDIT :

I tried the same query in my local environment (SQL Server 2014) and noticed that the query was only taking 30 seconds. I've played with the indexes and have been able to bring down he time to 13 seconds. I can still see that the order by (sort clause) is taking 4-5 seconds time. I tried updating the statistics, however the warning is the SQL plan did not go.

Final Indexes on FIELDDATA:

  1. Clustered Index on FieldID, ID, Table Row, Table Col
  2. Non-Cl. Index: a)FieldId, ID including SVALUE; b)SVALUE including ID

I also tried restructuring the query, however both with and without CTE the query takes near about the same time (12-14 seconds).

Execution Plan with and without CTE are: https://drive.google.com/folderview?id=0B0p8mDJ5O2ztflRHcTRlZlpsT25LMkpEUDd0OFRWblFnTEsxY20wN0lNU3RTMTA1VTJKTFk&usp=sharing

The new query (with CTE) is :

WITH FieldData_CTE (EVENT_ID, AdEmail, ProcID, NotType, NotID, UsrId, DuDate) 
AS 
(
    Select EVENTID,
    (SELECT SVALUE FROM FIELDDATA WHERE ID=E.EVENTID AND FIELDID='AdminEmail') "ADMINEMAIL",
    (SELECT SVALUE FROM FIELDDATA WHERE ID=E.EVENTID AND FIELDID='ProcessID') "PROCESSID",
    (SELECT SVALUE FROM FIELDDATA WHERE ID=E.EVENTID AND FIELDID='NotificationType') "NOTIFICATIONTYPE",
    (SELECT SVALUE FROM FIELDDATA WHERE ID=E.EVENTID AND FIELDID='NotificationID') "NOTIFICATIONID",
    (SELECT SVALUE FROM FIELDDATA WHERE ID=E.EVENTID AND FIELDID='SCUserID') "SCUSERID",
    (SELECT SVALUE FROM FIELDDATA WHERE ID=E.EVENTID AND FIELDID='DueDate') "DUEDATE"
    From Events E
)

SELECT E.EVENTID "EVENTID",E.STEPCOUNT "STEPCOUNT",E.STATUSMODIFIED "STATUSMODIFIED",
F.FORMID "FORMID",E.SUBJECT "SUBJECT",E.COPY "COPY",T.TRACKERID "DFS__TrackerID",
CASE WHEN E.EVENTTYPE=5 OR F.FORMTYPE='fp' THEN 0 ELSE 1 END "ALLOWOPEN", CTE.AdEmail "ADMINEMAIL", CTE.ProcID "PROCESSID", CTE.NotType "NOTIFICATIONTYPE", 
  CTE.NotID "NOTIFICATIONID", CTE.UsrId "SCUSERID", CTE.DuDate "DUEDATE" 

FROM  EVENTS E WITH (NOLOCK), FieldData_CTE CTE, 
    (SELECT EVENTID, EVENTDEFID, STATUS FROM EVENTS WITH (NOLOCK) WHERE EVENTTYPE<>6) E1, 
 FORMS     F1 WITH (NOLOCK), FORMS     F WITH (NOLOCK), ROUTEDEF  D WITH (NOLOCK), 
 TRACKER   T WITH (NOLOCK), CONTAINERS C WITH (NOLOCK)  

WHERE  (T.SENDER=@P0 OR (E.OWNER=LOWER(@P1) AND 

    (EXISTS (SELECT ID FROM FIELDDATA WHERE E.EVENTID=ID AND ((TABLECOL='isCC' OR FIELDID='isCC') AND SVALUE = 'true')))))
 AND (E.EVENTTYPE=1 OR E.EVENTTYPE=4 OR E.EVENTTYPE=6 OR E.EVENTTYPE=5) 
 AND ((T.ACTION>=1 AND T.ACTION<7) OR T.ACTION=10 OR T.ACTION=11 OR T.ACTION=4000 OR T.ACTION=4001 OR T.ACTION=4003 OR T.ACTION=4004) 
 AND E.EVENTDEFID=F.FORMREV AND E1.EVENTID=E.EVENTID AND E1.EVENTDEFID=F1.FORMREV AND D.FORMGUID=F1.FORMGUID AND T.TASKID=E.TASKID AND C.CONTAINERID=E.LOCATION AND 

    ((EXISTS (SELECT ID FROM FIELDDATA WHERE E.EVENTID=ID AND ((TABLECOL='NotificationID' OR FIELDID='NotificationID') AND LEN(SVALUE) > 0)))) 
AND ((DATEADD(day,-@P2,GETDATE()) < E.STATUSMODIFIED AND E1.STATUS<>1) OR (E1.STATUS=1)) AND E.EVENTID=CTE.EVENT_ID ORDER BY DUEDATE

I cannot create any more indices on the Events table (as suggested in the plan)

Best Answer

The problem is that the query has to retrieve 13.7 billion rows from FIELDDATA in aggregate. This is caused by two significant issues with your query. One will be easy to solve, one will be harder.

The easy problem is with the index. As one commented suggested, you need to add a nonclustered index on FIELDDATA(ID, FIELDID). The clustered index scan is pulling 5764 rows each time it's called, then filtering down to the one row it needs to return. The index should allow it to pull only one row each time it's called, reducing the total to around 2.3 million.

The larger, more difficult problem is that you're using subqueries to return column data in the SELECT clause. SQL Server processes the SELECT clause after most of the other clauses in the query. In this case, the FROM (JOIN) clause would be processed, then the WHERE clause would filter certain rows from those returned by JOIN. The SELECT clause is then applied to each of the rows returned by WHERE. In this case, that's 395,939 rows. Each subquery in the SELECT clause is run once for each row - since you have subqueries on 6 columns, you are querying FIELDDATA 2.3 million times each time your query is run.

To reduce that, you need to get those subqueries out of the SELECT clause and into something that can be joined in your FROM clause. If you're using SQL Server 2005 or later, my best suggestion would be to create a common table expression (or subquery) with a PIVOT clause to bring all of the SVALUE values for each ID into one row. If you're using an earlier version, your best bet may be to extract the FIELDDATA data into a temporary table prior to this query, again with one row per ID that contains all of the necessary SVALUE values.