I've got a schema something like having an Entity
table, with a to-many relationship to Attribute
, which has a to-many relationship to Value
. Attribute
has a column Name
, and Value
has a column Value
:
Entity
------
Id
Name
Attribute
---------
Id
EntityId
Name
Value
-----
Id
AttributeId
Value
I need to be able to construct a query that will return only Entity
rows that have certain combinations of attribute/value children. Currently I'm writing the query something like:
select
*
from Entity e
where exists(
select * from Attribute a
join Value v on v.AttributeId = a.id
where a.EntityId = e.id and a.Name = 'color' and v.Value = 'red')
and exists(
select * from Attribute a
join Value v on v.AttributeId = a.id
where a.EntityId = e.id and a.Name = 'size' and v.Value = 'small')
This works, but every extra attribute filter adds another exists
clause, and I'm nervous about performance. Is there a more efficient way to write queries of this type, or is it likely to be perfectly performant if I use correct indexes etc?
Best Answer
It's difficult to predict if your original query will have performance problems. We don't know how large your tables are, what indexes you have, how many filters you might add, how quickly the queries need to finish, or so on. You may want to consider writing the query in the most natural way to do and measuring performance. If performance is acceptable leave the code as is.
If you do need to improve performance, you can consider a rewrite similar to the following:
That will give you a single scan of the
attribute
andvalue
tables regardless of how many filters you add to theHAVING
clause. One downside is that you're likely to get a poor cardinality estimate from theHAVING
filter, but if the goal of the filters is to filter out most of the rows from theEntity
table then that may not be a problem.