Database Design – Advice on Performance and Subtypes

database-designperformancesubtypes

context :
- Teachers create problems to their Students
- A Problem have Numerals(Questions)
- A Numeral(Question) have Steps to guide the student to the answer
- Choice represent the type of mathematical abstraction that fits
  better
  - Solve: (unknown variable, starting expression)
  - Simplify: (reference, target)
  - Plot: (slope,b)
Base Models and Relations:
- Problems: one to Many Numerals(Questions)
- Numerals: zero to Many Steps
- Steps:
Requirements:
- Numerals and Steps now have a choice which where described above, currently just 3 choices but it's expected to grow. only one and at least one choice
- Depending on the choice the numeral or step will add new attributes
- for now choices do not have any relations between them
- choice in Numeral could be different to the choice in Step and vice versa (see Example)
Example: did my best to make a close abstraction to show the idea, I've made a JSON Structure, almost identical to the structure used in my front end. Maybe the example is a little bit dumb but it gives a taste.
My Attempts:
For simplicity , Models are shown with the minimum attributes
- 1st:
-My concerning with this approach is that the amount of FK in the Numeral and Step model. Since only one choice is possible, i would have choices-1 amount of useless columns per record
- 2nd:
-I'm programmatically checking for the choice field in the Numeral or the Step, and then using the proper table to fill the data, this second attempt started as a solution to fix the empty columns in the 1st approach

-My concerning with this approach is that the amount of tables will grown 2*n, since for each choice i need to create the corresponding table for step or numeral

I'm open to suggestions or ideas about this attempts.

Edit:
Thanks to @J.D for suggesting this idea. i've modified a bit because i need to keep track of the choices and their fields somewhere somewhere. I think this solution is easier to mantain and removes the need to add tables in case of new choices

3rd:

Best Answer

As Colin touched on, it's hard to follow what schema problem you're trying to solve with just generalities. Unfortunately the devil's in the details with this kind of thing, since one detail could drastically change the best approach.

Looking at your database diagrams at a high level the tables A, B, and C in the first diagram look like the same entity in regards to their purpose, but varies in value of their columns, most likely?...same when comparing tables ANumeral, BNumeral, CNumeral to each other, or when comparing AStep, BStep, and CStep to each other, in your second diagram. The question is why couldn't each of these sets of tables be just one table each (or possibly even one table altogether)?...e.g. you likely could have just a single StepAttributes table that Step joins to.

I did notice in your example the only difference between two tables of the same set are the attribute data types. If that's your reasoning for making multiple tables, instead just structure your single attribute table (e.g. StepAttribute) to account for that which can be done two ways.

The first way by having three columns, one being the AttributeName, the next being the AttributeValue (which will be a string type field, like VARCHAR, to accommodate any value), and the third being the AttributeDataType which stores the native data type of the attribute so you know what to appropriately cast it to when you consume it.

The second way is you can have a denormalized attribute table that has a nullable column of each data type with generic column names, such as DateAttributeValue, IntAttributeValue, VarCharAtrributeValue etc, and a column to specify which data type the value is / column it's stored in. Some systems even design such a table like DateAttributeValue1, DateAttributeValue2, DateAttributeValue3, IntAttributeValue1, IntAttributeValue2, etc so they can support storing multiple attributes within the same row.

Thanks for updating your question with more details. So at the least it looks like each sub-table to Numeral vs Step are a bit redundant in your second design, and I understand why you normalized them that way based on your first design. You could actually combine the best of both worlds though and just have a single Plot, Simplify, and Solve table with the foreign key reference field in those tables going back to your Numeral and Step tables by storing both IDs in that column and calling it something like NuneralOrStepId. Then you'd need an additional field on those tables called like IsForNumeral or ParentType (come up with a better name than mine lol) which would distinguish which base table it joins back to. Then you have no waste in fields and reduce your redundancy as well.

It depends on how many of these sub-tables you anticipate having to generate and manage (right now 3 is completely manageable) but if you think that's going to grow into over 10 tables or so, I'd still consider my first suggestion with one generic table of AttributeName / AttributeValue pairs. You can add any additional columns to this structure that make sense as well, such as the ParentType column to distinguish which parent table the Attribute belongs to.

Related Solutions

Database design advice

                                                  +------------ --- ---+
                                                  | Assembly options   |
+------------+------------+----------+------------+---+---+---+ --- +--+
| assembly ▼ | unit cost  | quantity | total cost | 1 | 2 | 3 |     |50|
+------------+------------+----------+------------+---+---+---+ --- +--+
| VSD55      | £10'000    | 2        | £20'000    | 1 | 1 |   |     |  |

If somebody handed that quote to me, my first question would be "What's option 1 for the VSD55?" The answer would be "I don't know." That information isn't on the quote. In the unlikely event that person got to field a second question, that question would be "What does it cost?" Again, the answer would be "I don't know." A very disturbing silence would follow immediately, during which the person who handed me the quote would imagine how much better it might feel to be run over by a train.

Options must be line items on the quote, along with their unit price, quantity, and total price. Options must be named, not numbered. They should appear directly under their parent assembly, too, not scattered all over hell and half of Georgia.

If you want a shot at my money, you'd better make it crystal clear what I'm supposed to be getting for my money.

There's nothing (much) wrong with 50 check boxes on a user-interface form. That makes it easy to pick options. But the UI code should read the checkboxes and insert the right information into normalized tables.

Ms-access ODBC calls for linked table behaviour

The only way to get complete control over the query for ODBC is to create them as pass-through queries.

In that mode, Access will not touch them and the SQL will be passed verbatim to the server (so your queries will need to be written in that server's particular SQL syntax or you'll get errors).

In ODBC mode, Access will decompose queries that are bound to controls into batches so it never returns more results than it need to display (that is what you see: it will decompose queries based on the primary key to return specific records only).

So you have to be careful of what you wish: if you optimise your queries using pass-through, you will get less calls, but all the data will be returned.

On the other hand, if you leave Access do the querying for you, it may be chattier, but the net performance may be better since less data may be returned more quickly.

You need to test both and choose what works best depending on the exact query and how it is used.

Best Answer

Related Solutions

Database design advice

Ms-access ODBC calls for linked table behaviour

Related Question