How about just a single table with a couple of content columns, something like :
ID | q_id |...|...| shortAnswer varchar | FullAnswer Text |...|...|
You can then place an unclustered index on the shortAnswer, and write a crud procedure which trims the user's answer for the shortAnswer field, while inserting the whole answer in the FullAnswer field
A coworker was suggesting that it is correct for a table to store 'description' and 'notes' as a foreign key rather than in the table for performance reasons, given that some of the records will contain NULL (or empty string in this case), and presumably will be split across pages.
Possibly, depending on RDBMS, the exact data types, and what the performance goal is.
For SQL Server, which it sounds like you're using, you can use sp_tableoption
with large value types out of row
to keep a LOB (nvarchar(MAX)
in this case) column in the table you want (logically), while pushing the bulk of the data storage onto separate pages (physically).
This will make accessing the LOB data slower, but speed up access to only the base data (probably moreso for scanning) as the tradeoff. Depending on the average LOB size, this setting may not make a significant difference. You will have to test this for your exact scenario to see if it improves your workload.
In any event, as long as you want to keep the one-to-one relationship, there's no reason to logically separate the values from the base table because there is an available way around the problem in SQL Server, assuming you're using the MAX
type variant. (Note: if you're currently using a fixed-length field, you could switch to the MAX
type and add a check constraint to limit the length.)
This conversation occurred after we noticed that the integrity of the one-to-one table design was broken with a duplicate record.
It seems like the design of these structures is smelly. Are you keeping a single table to store all of these fields, which are then referenced from the parent tables?
A better design would be to only separate the tables where there's an actual need (again, as I mentioned, this may not even be required) with a declarative one-to-one relationship, such as repeating the primary key of the parent table as the primary key in the LOB storage table.
It seems to me that if the developer who designed the column put it in the source table, all of the code that handles the one-to-one integrity would be unnecessary (and would have not caused this bug)
Quite possibly. The fewer moving parts there are in a system, the more likely it is to be reliable, probably with fewer bugs as a result of complexity.
Best Answer
At the end of the day, #3 is still the BEST option. We go with what we think is simple at that point but more often than not, business will come up with another reason to add one more type of address.
Design it correctly from the get-go! Good luck!