Sql-server – Optimisation of complex query – 100% cost index seek

performancequery-performancesql serversql-server-2008

I'm a c# developer, however I'm quite experienced with SQL. I have had a bit of experience with optimising queries but I've come up across one that has me stumped.

The query takes about 3 minutes to run for approximately 20,000 rows.
The rows start streaming immediately but for the 3 minute period, the SQL server cpu is pegged at 100%.

In this particular case, my queries performed well until the client requested I test them with 20k rows. These 20k rows (Clients) were distributed amongst 20 Practitioners (so roughly 1000 each)

Here's my database schema for the relevant tables:

enter image description here

The query uses a Recursive CTE to gather, given a source practitioner, the (n-level) tree of clients below them. It then joins to some FK type tables to collect additional information.

It appears the slow-down is coming from the join to the authority table, which represents which practitioners manage which clients.

enter image description here

pooling.ClientAccountInheritedRateGroup, pooling.ClientAccountFirstContact and pooling.PractitionerAccountFirstContact are all views which all come back with sub-second selects for the full data set.

Right, now for the actual query (sorry, it's a bit of a monster):

with h (distance, id, name, parentid, [path]) as ( 
select 
    0,
    pa.id,
    pa.name,
    null,
    convert(varchar(80), ltrim(str(pa.id))) as node_seq
from     
    pooling.practitioneraccount pa
    left outer join pooling.practitioneraccountrelation as par
        on pa.id = par.childid
where  
    pa.id = @PractitionerAccountId
    or (@PractitionerAccountId is null and par.parentid is null)

union all

select 
    1,
    -1,
    'Unmanaged Clients',
    -1,
    '-1'
where @practitioneraccountid is null

union all

select   
    distance + 1,
    pa.id,
    pa.name,
    par.parentid,
    convert(varchar(80), h.path + ',' + ltrim(str(pa.id)))
from     
    pooling.practitioneraccount pa
    inner join pooling.practitioneraccountrelation as par
        on pa.id = par.childid
    inner join h on par.parentid = h.id
)

select  
    distance as Depth,
    h.id as PractitionerAccountId,
    h.name as PractitionerAccountName,
    pafc.FullName as PractitionerAccountContactName,
    h.parentid as PractitionerAccountParentId,
    path as PractitionerAccountPath,
    ca.id as ClientAccountId,
    ca.name as ClientAccountName,
    ca.irdnumber as ClientAccountIrdNumber,
    bd.id as ClientAccountBalanceDateId,
    bd.day as ClientAccountBalanceDateDay,
    bd.month as ClientAccountBalanceDateMonth,
    cafc.FirstName as ClientAccountContactFirstName,
    cafc.LastName as ClientAccountContactLastName,
    cafc.PhoneNumber as ClientAccountContactPhoneNumber,
    cafc.MobileNumber as ClientAccountContactMobileNumber,
    ca.TownOrCity as ClientAccountLocation,
    rr.Rate as ClientAccountRwtRate,
    rr.Id as ClientAccountRwtRateId,
    cairg.RateGroupId as ClientAccountInheritedRateGroupId,
    cairg.RateGroupName as ClientAccountInheritedRateGroupName,
    ag.id as ClientAccountAssociatedGroupId,
    ag.name as ClientAccountAssociatedGroupName,
    agca.uniqueid as ClientAccountAssociatedGroupUniqueId,
    agca.approved as ClientAccountAssociatedGroupApproved
from    
    h
    left outer join security.practitioneraccountauthority as paa 
        on (h.id = paa.practitioneraccountid)
    left outer join pooling.clientaccount as ca 
        on ( 
            paa.clientaccountid = ca.id 
            or (h.id = -1 and ca.id not in (
                select clientaccountid 
                from security.practitioneraccountauthority
                )
            )
        )
left outer join config.RwtRate as rr  on ca.RwtRateId = rr.Id
left outer join config.balancedate as bd on (ca.balancedateid = bd.id)
left outer join pooling.ClientAccountInheritedRateGroup as cairg 
    on (ca.id = cairg.ClientAccountId)
left outer join security.associatedgroupclientaccount as agca 
    on (ca.id = agca.clientaccountid)
left outer join security.associatedgroup as ag 
    on (agca.associatedgroupid = ag.id)
left outer join pooling.ClientAccountFirstContact as cafc 
    on (cafc.ClientAccountId = ca.Id) 
left outer join pooling.PractitionerAccountFirstContact as pafc 
    on (pafc.PractitionerAccountId = paa.PractitionerAccountId)

The CTE in the query gives us the full tree of practitioners, which is then joined via the authority table to the clients under them. The CTE portion of the query returns almost instantly, only when we start joining to the clients does the performance issue appear.

I've added an index (and statistics) onto the ClientAccountId column of the authority table

CREATE NONCLUSTERED INDEX [IX_PractitionerAccountAuthority_ClientAccountId]
     ON     [security].[PractitionerAccountAuthority] 
(
    [ClientAccountId] ASC
)

Please ask if you need additional information such as the definition of the views mentioned above.

Best Answer

Here is your description of the problem: "The CTE portion of the query returns almost instantly, only when we start joining to the clients does the performance issue appear".

One possible, and likely, explanation is as follows: the optimizer fails to estimate the cardinality of your subtree, and chooses an inefficient plan. With your way of storing hierarchies, this is no surprise. How would you yourself estimate the size of a subtree without actually retrieving it?

Can you use materialized path? Getting a subtree using materialized path is essentially one range scan, fast and simple, and the optimizer can have a good cardinality estimate off the statistics on one index.

In my experience, your way of storing/reading hierarchies does not scale up. I have never been able to make it work fast and use resources efficiently.