Efficient Query for Advanced EAV Model in SQL Server

eavsql servert-sql

I'm kind of stuck on how to build an effecient query which returns the data of the following EAV structure.

Today, a product table already exists, containing 4 fixed fields. We want to upgrade the system, allowing an infinite amount of additional product fields, defined by the manufacturer of each product. We call these additional fields 'parameters'.

A parameter can be of the following data type:

  • Text
  • Date (range)
  • Boolean
  • Dropdownvalue (single)
  • Dropdownvalues (multiple)

Based on that, I created the following database model:

Database Scheme
:

  • Product: the original product table, containing the fixed fields.
  • Parameter: this table defines all types of parameters. Currently 5: Text, Date, Boolean, Dropdown value (single), Dropdown value (multiple)
  • ManufacturerParameter: this table stores which parameters are defined by the product manufacturer (Manufacturer table not included in the picture) for all of his products.
  • ProductParameter: this is where the actual product data of all the defined parameters is being stored. A parameter of the type 'Text', stores its information in the field 'Text', 'Date' in 'DatumBegin' and 'DatumEnd', Bit/Boolean in 'Boolean'.
  • ParameterValueListItem: this table contains the pre-defined values of the Dropdownlist values for both single and muliple coupling. These are the values you get to see in the Dropdownlist when creating or altering a product.
  • ProductParameterValueListItem: this is where the actual product data is stored for the Dropdownlist values.

I need to build a webpage in ASP.NET C# which shows a list of all product data, including each defined parameter with it's value for that product. I need to be able to filter the list by searching on any field/parameter. This list also needs to be sortable on any field/parameter.

I tried to filter the list by outer joining Product – ProductParameter – ProductParametervalueListItem – ParameterValueListItem to obtain a table result containing all data, and then filtering that list by using a dynamic where clause, built and passed through by the web app. This works when you search on 1 parameter but when you start searching on multiple parameters, you get incorrect results because each parameter is a different line in the table result and there simply is no 'AND'-match between parameter A and B, because they don't exist in the same record (line).

Can anyone advice me how this can be achieved? The most perfect solution would be 1 table result, containing all data and each parameters being shown as a column in that same product row. Is this too complex to handle at database level?

Thanks for reading this far. Any advice is welcome.

Best Answer

First of all, what you are about to design is probably a VERY bad idea. A much better solution would be to have a dynamic schema where you add new tables and have the application understand how to query those table (you could place them in a schema). This largely avoids all the locking and query plan issues you are bound to run into with this model. There is nothing wrong with applications running CREATE TABLE now and again.

Second, I am not sure I understand why you have normalised Parameter into its own table? Why not put that directly into the ManufacturerParameter table.

Third, if you insist on proceeding with your current model, there are ways to achieve what you want (at least if I am interpreting your requirement correctly). What you can do is to write your query in such a way that it is a sums up the search argument when there is a match and then use HAVING to filter out the values that match. I am assuming that only one of the fields Text, Boolean, Datum etc are populated per ProductParameter record (you probably want to enforce this with a constraint)

For example, to search for all products that have a bolean = true for one parameter AND text = 'abc' for some other parameter you can do:

SELECT P.Name
FROM Product P
JOIN ProductParameter PP
WHERE P.ID = Foo
  AND PP.Boolean = 1 OR PP.Text = 'abc'  ... /* For each filter */
GROUP BY P.Name /* And any other things you want out of product */
HAVING COUNT(*) >= [Number of where clauses]

If you need to list all the parameters of this product, you can use the above query template as a nested query and join back to ProductParameter.

The above query CAN be optimised by maintaining a computed column in ProductParameter that has a string representation of the different data types in that table. That way, the above OR statements can be rewritten as an IN list (which you will want to pass as a table valued parameter).

I would like to repeat that what you are doing is probably very wrong. If you do it, you will most likely need to hand tune most of your query plans - the optimiser will not help you anymore. And that is assuming you don't have too many query variants, which will run your plan cache full.

Related Question