I am creating a cms system where the user can create fields to use in the templates of the website. I want to create different types like 'string', 'html', 'integer' etc. The user must also have the possibility to add these fields in different values, so I will be needing at least two tables.
My question is this: Should I create a table for each field value type and catch this in the code or should I store all field value types in one table?
Option 1:
Table for each type
+---------------+ +-----------------+ +--------------------+
| Fields | | field_int_value | | field_string_value | etc...
+---------------+ +-----------------+ +--------------------+
| id | | field_id | | field_id |
| field_type_id | | language_id | | language_id |
+---------------+ | value (int) | | value (varchar) |
+-----------------+ +--------------------+
Option 2:
Store all values in one table. keep the unused values NULL
.
+---------------+ +-----------------+
| Fields | | field_value |
+---------------+ +-----------------+
| id | | field_id |
| field_type_id | | language_id |
+---------------+ | value_int |
| value_string |
| etc... |
+-----------------+
Some examples:
A user could for example create a string with his or her phone number. In the code the template would call this field to display the phone number. If the user would ever change this, they can just edit the value in the admin without having to worry about editing the template.
Another example is a boolean to allow the user to show a certain 'call to action' on a page. This boolean would be used to hide something in the template without having to update the template.
My Field model would be the one handling all this. In here I would create functions to get the field of a specific type.
I have been breaking my head over what would be the best option, but I can't seem to find the better one. Does anyone have any ideas?
Best Answer
Not discounting @gbn's suggestion (in a comment on the question) to possibly store this data somewhere other than in a RDBMS, I will say that if you do decide to go the RDBMS route, you are better off using a single table with a column per each "type" such that they can be strongly-typed. Or, you could use a single table with a single string field as all data can be converted/serialized to a string, but then you have to be careful to do the conversion properly in both directions for all operations, and that might prove problematic if some uses of that table either incorrectly convert or forget to convert altogether.
The reason for choosing a single table is mostly logistics: it is far easier to dynamically select a different column based on a COALESCE or CASE (or CASE-like) clause than it is to dynamically pull from different tables in a query based on the TypeID of the row (at least not without having to LEFT JOIN all "type" tables in all queries (yikes!).
Depending on which RDBMS you are using, there might even be vendor-specific features to assist in this model. For example, SQL Server, starting in version 2008, I believe, has a
SPARSE
option that can be added to NULLable columns. This allows for a NULL value to take up 0 bytes whereas normally, for fixed-length fields (i.e. pretty much all of them that aren't[N]VAR{something}
or{something} VARYING
orXML
or genericallytext
orSQL_VARIANT
), they would always take up their specified number of bytes. The downside is that for non-NULL values, they now take up 4 extra bytes per row. But it can be a huge savings when the column is 75% or more NULL across all rows. This feature was designed specifically to address this type of model.Just make sure to have a
CHECK CONSTRAINT
to ensure that for all of thevalue_
fields, either none, or only one, of them isNOT NULL
.Then you can access via something like:
And again, if you are using SQL Server (and maybe one or more other vendors have this feature), you can make use of "filtered indexes" to index each specific
value_
field where that value is not null:This will allow you to target rows that do have values, although probably not when using the generic
COALESCE
access method, but when targeting a specificValue%
field.AND, looking again at the two options, I would actually go with Option 3, which is a single table. In Option 2 it does not seem like there is much purpose in breaking
field_type_id
out as that data is not really a parent to thefield_value
data. Afield_id
can only have a single value, right? So just putfield_type_id
in thefield_value
table. And make use of it in the CHECK CONSTRAINT that ensures a single field at most is NOT NULL, to ensure that the one field, if any, that is NOT NULL matches thefield_type_id
.