Sql-server – Cannot tune database any further; what next

performanceperformance-tuningquery-performancesql serversql-server-2008-r2

We have a vendor-supplied application. It is under support and we are talking to their developers (we have their largest database size by an order of magnitude), but in the meantime, we have the following query that is run thousands of times a day, with the only parts changing the ABCDEFG12345 in the where clause. Almost everything in the where's are customisable, as it is the query generated by their master job search.

There are about 500,000 records in the dJobs table, and similar numbers of records in the joined tables:

select * from (
    select   top       20 * from (
        select  top       20 
            (       --Start Count
                select count(*) from dJobs
                left join dClients on cltClientID = jobClientId 
                left join dJobStatus ON jbsID = jobJobStatus    

                where ((jobSupervisor=         -1.00000000  or jobCoordinator=         -1.00000000 )  OR 1=1 ) 

                and 
                (
                (jobjobStatus=0 OR (0=0  AND 0=0)) 
                AND (jbsType=0 OR (0=0  AND 0=0)) 
                AND (jobPriority=0 OR (0=0  AND 0=0)) 
                AND (jobEventID=0 OR (0=0 AND 0=0))
                AND (jobSiteCode='' OR (''='' AND ''=''))
                AND (jobjobStatus IN (
                    select jbsid from djobstatusgroupmapping where jsgid = 0
                    )               
                 OR (0=0 AND 0=0))
                   AND jobDateCreated BETWEEN '2004-06-10' AND '2014-06-10'    AND jobBookedDate BETWEEN '1901-01-01 00:00:00' AND '2024-06-10 23:59:00'    
        and
        (('ABCDEFG12345'<>'' 
            AND
            ( 
                (1 = 1 AND 
                    (
                        jobWorkOrderNo like '%ABCDEFG12345%' OR
                        jobSiteName like '%ABCDEFG12345%' OR
                        jobSiteLocationBuilding like '%ABCDEFG12345%' OR
                        jobSiteAddress like '%ABCDEFG12345%' OR
                        jobSiteSuburb like '%ABCDEFG12345%' OR
                        jobSitePostcode like '%ABCDEFG12345%' OR
                        jobWorkToDo like '%ABCDEFG12345%' OR
                        jobSiteClient like '%ABCDEFG12345%' OR
                        jobSiteContact like '%ABCDEFG12345%' OR
                        jobSiteContactPhone like '%ABCDEFG12345%' OR
                        jobSiteContactPhone2 like '%ABCDEFG12345%'
                    )
                ) 
            OR 
                (1 = 1 AND 
                    (
                        cltClientName like '%ABCDEFG12345%' OR
                        cltDivision like '%ABCDEFG12345%' OR
                        cltAddress1 like '%ABCDEFG12345%' OR
                        cltAddress2 like '%ABCDEFG12345%' OR
                        rtrim(cltAddress1)+' '+rtrim(cltAddress2) like '%ABCDEFG12345%' OR
                        cltSuburb like '%ABCDEFG12345%' OR
                        cltPostCode like '%ABCDEFG12345%' OR
                        cltState like '%ABCDEFG12345%' OR
                        cltTelephone like '%ABCDEFG12345%' OR
                        cltContact like '%ABCDEFG12345%' OR
                        cltMobile like '%ABCDEFG12345%' OR
                        cltEmail like '%ABCDEFG12345%'
                    )
                ) 
            )
        ) 
        OR 'ABCDEFG12345'='')                
                )   --End Count
            )  as vRecCount, 
            jobID,
            jobWorkOrderNo,
            jobSequence,                        
            jobSiteClient,
            jobSiteStreetNumber,
            jobSiteAddress,
            jobSiteSuburb,
            jobSiteState,
            jobSitePostcode,                
            jobDateofLoss,
            jobDateofLossTime,
            jobDateCreated,
            jobTargetDate,
            jobClientID,            
            jobPriority,
            jobJobStatus,
            jobEventID,
            regFullname,
            case 
            when (not jobSupervisor=         -1.00000000 ) and jobCoordinator=         -1.00000000  then 1 else 
            case when (jobSupervisor=         -1.00000000 ) and (jobCoordinator=0) then 2 else   
            case when (jobSupervisor=         -1.00000000 ) and (not jobCoordinator=         -1.00000000 ) then 3 else 0
            end end end as 'Coordinator',
            jobSupervisor,
            jobBookedDate,
            jobLat,
            jobLong,
            jobClaimAssistRequest,
            jobBooked,
            jobSiteContact,
            jobSiteContactPhone2,
            jobPCM
        from dJobs 
        left join dClients on cltClientID = jobClientId     
        left join dUsers on regUserId = jobCoordinator  
        left join dJobStatus ON jbsID = jobJobStatus    

        where ((jobSupervisor=         -1.00000000  or jobCoordinator=         -1.00000000 )  OR 1=1 ) 

        and 
                (
                (jobjobStatus=0 OR (0=0  AND 0=0)) 
                AND (jobPriority=0 OR (0=0  AND 0=0)) 
                AND (jbsType=0 OR (0=0  AND 0=0)) 
                AND (jobEventID=0 OR (0=0 AND 0=0))
                AND (jobSiteCode='' OR (''='' AND ''=''))
                AND (jobjobStatus IN (
                    select jbsid from djobstatusgroupmapping where jsgid = 0
                    )               
                 OR (0=0 AND 0=0))                      
                )
                  AND jobDateCreated BETWEEN '2004-06-10' AND '2014-06-10'    AND jobBookedDate BETWEEN '1901-01-01 00:00:00' AND '2024-06-10 23:59:00'    
        and
        (('ABCDEFG12345'<>'' 
            AND
            ( 
                (1 = 1 AND 
                    (
                        jobWorkOrderNo like '%ABCDEFG12345%' OR                 
                        jobSiteName like '%ABCDEFG12345%' OR
                        jobSiteLocationBuilding like '%ABCDEFG12345%' OR
                        jobSiteAddress like '%ABCDEFG12345%' OR rtrim(jobSiteStreetNumber)+' '+rtrim(jobSiteAddress)  LIKE '%ABCDEFG12345%' OR
                        jobSiteSuburb like '%ABCDEFG12345%' OR
                        jobSitePostcode like '%ABCDEFG12345%' OR
                        jobWorkToDo like '%ABCDEFG12345%' OR
                        jobSiteClient like '%ABCDEFG12345%' OR
                        jobSiteContact like '%ABCDEFG12345%' OR
                        jobSiteContactPhone like '%ABCDEFG12345%' OR
                        jobSiteContactPhone2 like '%ABCDEFG12345%'
                    )
                ) 
            OR 
                (1 = 1 AND 
                    (
                        cltClientName like '%ABCDEFG12345%' OR
                        cltDivision like '%ABCDEFG12345%' OR
                        cltAddress1 like '%ABCDEFG12345%' OR
                        cltAddress2 like '%ABCDEFG12345%' OR
                        rtrim(cltAddress1)+' '+rtrim(cltAddress2) like '%ABCDEFG12345%' OR
                        cltSuburb like '%ABCDEFG12345%' OR
                        cltPostCode like '%ABCDEFG12345%' OR
                        cltState like '%ABCDEFG12345%' OR
                        cltTelephone like '%ABCDEFG12345%' OR
                        cltContact like '%ABCDEFG12345%' OR
                        cltMobile like '%ABCDEFG12345%' OR
                        cltEmail like '%ABCDEFG12345%'
                    )
                ) 
            )
        ) 
        OR 'ABCDEFG12345'='')
             order by  jobid desc ,  jobDateCreated desc  
    ) as newtbl   order by  jobid asc ,  jobDateCreated asc 
 ) as newtbl_2 order by  jobid desc, jobDateCreated desc

The SQL Server Database Tuning Advisor has no suggestions for me – so it looks like the developers have done a good job at creating appropriate indexes and statistics for the query.

Until they can refactor things at their end, which could take 6+ months, what's left for us to do? Just throw more money at the SQL server (SSDs, more RAM, perhaps Enterprise Edition for partitioning)?

Breakdown of transaction cost is:

  • Total CPU Time (ms.): 12,237.70
  • # Total Logical IO: 2,331,089
  • # Avg. Logical IO: 2,331,089.00
  • # Logical Reads: 2,331,089

SQL Server 2008 R2 Standard running inside a VMWare Virtual Machine.


Interesting part of the execution plan below. Most time spent on a clustered index scan.

enter image description here


Sorry for the delay; the following stats were taking from our staging server, which has 1/10th the number of records as production, but is also on lower class hardware. Execution Plan XML, STATISTICS IO:

Table 'dClients'. Scan count 5, logical reads 705780, physical reads 9, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'dJobs'. Scan count 10, logical reads 23386, physical reads 16, read-ahead reads 10601, lob logical reads 186928, lob physical reads 578, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'dUsers'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Best Answer

Fulltext isn't going to help without refactoring to use the full text functions ( CONTAINS, FREETEXT or their table equivalents ). It also doesn't really work with leading wildcard. Hacks are available, but basically you're going to struggle to write a semantically equivalent query for fulltext. For the future consider redesigning for fulltext which has stemming ( run, runner, running ) and thesaurus ( jogger ) which could serve your searches much better than two wildcards.

SSD is unlikely to help you unless you are memory bound. Your tables (at only 500k records) are probably in-memory most of the time. Can you confirm the size of the dJobs table, and server RAM?

Enterprise Edition could help where the limitation of 64GB RAM / lesser of 4 sockets or 16 cores goes up to 8, but you're going to need a really powerful box to notice a difference. For example, the 4 really means you could have something like 4 quad-core processors totalling 16 cores, with HT enabled, you're already at 32 logical processors. The general recommended server maxdop for this type of OLTP machine would be 8 anyway. I think this unlikely to benefit because your query has more fundamental problems but you never know.

Non-clustered indexes (particularly on dJobs) are unlikely to help because the query has so many columns from this table in the SELECT and many criteria in the WHERE clause. A non-clustered would have to be so wide to cover it would be practically a duplicate of the clustered index, therefore overly expensive to maintain. As the query sorts by jobID DESC, I considered a descending index but haven't trialled this.

Partitioning, (Enterprise only) is really a great feature, but again is unlikely to help you. I did a quick investigation of partitioning on dbo.dJobs.jobJobStatus column, eg I imagine you only have a small percentage of Jobs 'active' at any one time, eg a few hundred, even a few thousand from the 500,000 records. Partition elimination would probably be cancelled out by the OR OR OR approach. Parallel scans of multiple partitions are also an Enterprise feature:

This would probably work:

SELECT TOP 20 *
FROM dJobs
    LEFT JOIN dClients on cltClientID = jobClientId
    LEFT JOIN dUsers on regUserId = jobCoordinator
    LEFT JOIN dJobStatus ON jbsID = jobJobStatus
WHERE
    (
    jobjobStatus IN ( SELECT jbsid FROM djobstatusgroupmapping WHERE jsgid = 0 )
    )
ORDER BY jobID DESC

This probably won't work:

SELECT TOP 20 *
FROM dJobs
    LEFT JOIN dClients on cltClientID = jobClientId
    LEFT JOIN dUsers on regUserId = jobCoordinator
    LEFT JOIN dJobStatus ON jbsID = jobJobStatus
WHERE
    (
    jobjobStatus IN ( SELECT jbsid FROM djobstatusgroupmapping WHERE jsgid = 0 )
    OR ( 0=0 ) OR ( 0=0 )   --<< this 'OR always true' means 'get the whole table'
    )
ORDER BY jobID DESC

This leads me into the query. The OR OR OR approach basically means 'always get the whole table'. The TOP 20 masks this design problem. The TOP also probably pushed the plan towards Nested Loops which Jon suggested was suspect. What also stands out to me about this nightmareish "scan all columns" constructed query is that you bascially have two copies of the same query (and therefore tables), one to count, one for the resultset. This might be more efficient if the data went into an intermediate table and the count was done from there for example.

Finally, this brings me to the only only thing that would actually help you (without a large-scale refactor of the code): data deletion or archiving. As mentioned, I imagine you only have a small percentage of Jobs 'active' at any one time. Carve off the 'inactive' ones into a different table. Create a view over the top of the two tables for reporting. Set up a nightly job to copy out the old records.

Having only a few thousand active jobs in your main table will most likely transform your query performance.

Some recommended reading:

Erland Sommarskog's article on these "search all columns" queries Dynamic Search Conditions in T-SQL http://www.sommarskog.se/dyn-search-2008.html

Querying Multiple Columns (Full-Text Search) http://technet.microsoft.com/en-us/library/ms142488(v=sql.105).aspx

I hope that helps!