Database Design – One table for different types of transactions

database-designrelational-theory

Considering a procurement system as an example. Would it make sense to have one single table for the following: Quote, Order, Invoice and Credit Memo?

There will be some details that are different. For example each list will have its 5 special fields but the other say 50 fields would be exactly the same. Would it make sense in this case to join all tables in a single one. And some fields are unused for some types.

I can see that Navision system does something like that in its database. It uses one table, but then splits into separate tables once the items are posted. I think this may be for performance reasons because the posted items can tend to become very large. & splitting into separate tables would speed up queries. But this is just my thinking.

So my main dilemma is … what is the recommended approach in such case: 1 table or separate tables? what are the pros and cons of each? What are the business considerations to take into account?

Thanks for your help!

Best Answer

There are essentially three options (with the possibility of complexity via combination of the three options):

  1. Keep each table separate
  2. Bring all of the tables together as one major table
  3. Use an associative relationship: a base table with the common columns, and individual tables for the non-common columns

Here are some of the factors to consider:

  • Do these really represent the same entity? If so, what is that entity? Based on your examples (Quote, Order, Invoice, and Credit Memo), these are separate entities to me. They are 'related' but are not the same or subtypes of a master entity (such as having an Employee table and ExemptEmployee and NonExemptEmployee subtables which have fundamentally different attributes). This tilts in favor of keeping tables separate.
  • Do you need to query against Quote UNION Order UNION Invoice? Note that this is different from joining Quotes to Orders (to get, say, a percentage of successful sales). If you do need to do those types of queries often, that argues in favor of a single table. I could see doing this if you're looking at "the life of an order." On the other hand, that's still a simple query even when split out into several tables, so it's a minor consideration.
  • Are there columns which are not common to all of the types? You mentioned that there are several special fields per type. This means you probably do not want to go with option #2 in any case, because option #3 re-introduces relatively simple integrity constraints. Suppose that, in your big table, you have three columns which are required if you are doing an Order, but should be NULL otherwise. Doing this is trivial with associative tables, but requires check constraints with the One Big Table approach. And if column X for an Order needs to be numeric but for an Invoice needs to be text, that just makes validation way more difficult.
  • Do you ever change types? Not just copying Quote data into an Order, but actually changing the Quote itself to an Order. If so, that's an argument in favor of one big table. But I don't think that's a realistic business process.
  • How many orders do you normally get? The more orders you get, the more you want to split out those tables. A scan against X pages is still better than a scan against 4X pages, and separate tables increases the likelihood that you'll be able to do seeks instead of scans because of finer-grained control of indexes.

Based on these considerations, I would personally keep the tables separate. Even though they may have similar attributes, they are fundamentally different entities.

What I would try to do, however, is look at those sets of common attributes and see if they really belong on the Quote/Order/Invoice/Credit Memo, or if normalizing those out into their own tables would be a better idea. For example, if you have customer first name, customer last name, customer street 1, etc., splitting that out into a customer key (or customer and address keys, depending upon circumstances) could cut down on the total number of columns in each of the four major entities, reducing repetition and gaining the other benefits of normalization.

It's possible that you've already normalized everything out to the fullest extent and that each of those entities does, in fact, legitimately have 50 attributes which are in common. In that case, I'd still keep separate tables because you're talking about fundamentally different things which happen to relate to one another.