There are a few different strategies you could take, where on one end you pursue aggressive normalization to, on the other end, full denormalization. The full denormalization would be equivalent to your second example where all relevant info simply ends up in the transaction table without references to other tables.
Full Normalization
So, to completely normalize, you would still want a Categories
table, but you want to even eliminate the storage of redundant information in this table, so you would need a CategoryList
table and a SubCategoryList
table as
CategoryList
id category
----------------
1 Food
2 Household
etc...
and
SubCategoryList
id subcategory
----------------
1 Work Lunch
2 Fast Food
3 Grocery Store
4 Mortgage
5 Repairs
etc...
You could then construct your category table from these two tables as
Categories
id category_id subcategory_id
----------------------------------
1 1 1
2 1 2
3 1 3
4 2 4
5 2 5
Treatment of NULL
subcategories can easily be handled by either 1) simply placing a NULL entry for the subcategory_id
column in the appropriate row of the Categories
table, or 2) adding a subcategory entry id, subcategory
where the subcategory field is NULL
.
Last but not least you would add a foreigh key reference from your Transactions
table to the appropriate id
in the Categories
table.
Does it really need to be so normalized?
Well, in my opinion, no it doesn't. I've heard a quote, though I can't remember who spoke it, but it basically goes "Normalize until it hurts, denormailize until it works." Especially in the case where you don't have a lot of categories, the fully normalized design may be a little bit of overkill.
What might simply make more sense would be to keep the above mentioned CategoryList
and SubCategoryList
tables to enumerate your types, but skip making the separate Categories
table, and then simply have your Transactions
table referencing the CategoryList
and SubCategoryList
tables as
id txdate amount category subcategory account ...
--------------------------------------------------------------
1 6/25/15 15.25 1 2 cash ...
This way, you save on storage, and you can easily update/modify any category or subcategory entry in the list without needing to modify your entire Transactions
table. Further, you can simply permit the subcategory
column of the Transactions
table to permit NULL
entries, if need be.
Hope this helps!
Best Answer
Assuming the digits to the left of the
+
are measured in units of 100, you could perhaps create constraints like this:However, this doesn't allow dynamic constraints, all rows in the table will be constrained to those measurements, which may in fact be exactly what you need.
If you need dynamically adjusting constraints for each independent row, you could do this:
Selecting data would look like this:
Results, like this:
The post columns have been split into two to enable us to use the correct data types to store numeric data. If we try to store
175+80.50
in a single field, we end up using avarchar(x)
column, which allows all kinds of possibilities for bad data, such astee+27.-1
, which are very difficult to comprehensively prevent. So, we store the station in one column, and the offset from that station in the next column. When presenting this data on screen or on reports, to humans, we'd use the concatenated version shown above.