Sql-server – Need help to optimize this slow performance Query

azure-sql-databasesql server

Need help to refactor a query. I need to get a result from four tables and structure looks like this

  1. Resource table

Id | Leader | Title | Author |

  1. ResourceField Table

Id | ResourceId | Position | ResourceFieldValueId |

  1. ResourceFieldValue Table

Id | Tag | Ind1 | Ind2 | Normalized | Subfields | SpecialTag |

  1. ResourceHolding Table

Id | Barcode | ResourceId | LocationId | CallNumber | ResourceCategoryId | ResourceStatusId | Cost |

Relation between tables:

These tables are to hold library's resource information. Resource represents book, ResourceHolding represents individual copies of a book, ResourceFieldValue represents properties of books and ResourceField holds relation between Resource and ResourceFieldValue. So,

  1. A resource can have multiple Resource Holding records
  2. A resource can have multiple ResourceFieldValues in ResourceField table

Note: In ResourceFieldValue table, each property of a book/resource represents by three digit numeric value (saved as a string) in Tag column and some of the Tags are grouped into one special group. For example these tags represents ('100', '110', '700', '800', '920') Author group, Values in these tags represent ('130', '222', '242', '245', '246', '730', '830') Title of a resource/book.

Query I am having issue with:

I need to get x number of unique holdings with Author's name and Title from ResourceFieldValue.. I have Title and Author in Resource table but I can't use that.

This the query so far I have come up with but it is still slow.. I have around 40,000 holdings and 360k+ ResourceFieldValue records. With this query, right now it takes around 30 second to get results.

SELECT DISTINCT
h.ResourceId, 
h.Id AS HoldingId, 
h.Barcode AS [BAR], 
h.Call AS [CAL], 
h.ResourceCategoryId as CategoryId,
fvAUT.Normalized fAUT,
fvAUT.Subfields sf_AUT,
fvTIT.Subfields sf_TIT
FROM ResourceHolding h
OUTER APPLY (SELECT TOP 1 Subfields, Normalized FROM ResourceField rfAUT INNER JOIN ResourceFieldValue rfvAUT ON rfAUT.ResourceFieldValueId = rfvAUT.Id AND rfvAUT.Tag IN ('100', '110', '700', '800', '920') AND ResourceId = h.ResourceId)  fvAUT
OUTER APPLY (SELECT TOP 1 Subfields, Normalized FROM ResourceField rfTIT INNER JOIN ResourceFieldValue rfvTIT ON rfTIT.ResourceFieldValueId = rfvTIT.Id AND rfvTIT.Tag IN ('130', '222', '242', '245', '246', '730', '830') AND ResourceId = h.ResourceId)  fvTIT
 WHERE h.ResourceStatusId != 7
ORDER BY fAUT asc
OFFSET 0 ROWS FETCH NEXT 10000 ROWS ONLY

Let me know if you need more info to understand table structure. Any help would be greatly appreciated.

Thanks

Best Answer

we need to know more information as Justin mentioned in the comment to answer you better, however meanwhile try this method, it improve performance a little bit better:

with cte_ as (
select ResourceId
    , max (case when rfvAUT.Tag IN ('100', '110', '700', '800', '920')  then Subfields else null end) as sub1
    , max (case when rfvAUT.Tag IN ('100', '110', '700', '800', '920')  then Normalized else null end)  as Norm1
    , max (case when rfvAUT.Tag IN ('130', '222', '242', '245', '246', '730', '830') then Subfields else null end) as sub2
    , max (case when rfvAUT.Tag IN ('130', '222', '242', '245', '246', '730', '830') then Normalized else null end)  as Norm2
FROM 
    ResourceField rfAUT 
    INNER JOIN ResourceFieldValue rfvAUT 
        ON rfAUT.ResourceFieldValueId = rfvAUT.Id 
    group by 
        rfvAUT.ResourceId 
)


SELECT DISTINCT
    h.ResourceId, 
    h.Id AS HoldingId, 
    h.Barcode AS [BAR], 
    h.Call AS [CAL], 
    h.ResourceCategoryId as CategoryId,
    cte_.norm1 fAUT,
    cte_.sub1 sf_AUT,
    cte_.sub2 sf_TIT
FROM ResourceHolding h
left join cte_ 
 on cte_.ResourceId = h.ResourceId
WHERE h.ResourceStatusId != 7
ORDER BY fAUT asc
OFFSET 0 ROWS FETCH NEXT 10000 ROWS ONLY