SQL Server – Different Tables with Identical Fields

database-designsql server

My question is for designing something in MSSQL server.

I work at an engineering company that designs embedded systems. Basically, we have 3 different hardware components in a product: one is for communication on a network (GPIB, Ethernet, Modbus…), one is for measurement/control functions, and the third is for miscellaneous stuff (like diagnostics and logging). Each piece of hardware has multiple firmware versions available. Each firmware has a set of registers used for configuration and a set of defaults.

I want to create a database to track the firmware, hardware, default configs, and register maps. I can create three separate tables for firmware, like commHardware, controlHardware, miscHardware, and then create tables that relate to these (i.e., commFirmware with a foreign key to commHardware and commDefaults with a FK to commFirmware).

Or… I can structure my tables so that there's one table for hardware, one for firmware, one for register maps, one for default configs, and one for hardware role (comm/misc/control).

The representation will be the same no matter which option I choose, so I don't have to shoehorn one design to fit the other. Right now, I'm leaning toward option 1, for two reasons.

  1. If I confuse something, it might not work – if I try to download comm firmware 01 onto a control board, it'll be blatantly screwed up. I don't think it is physically possible to do so – the processors are different.
  2. If I confuse something, it might actually work… for a while. Downloading the miscellaneous registers onto a communications board can be done, but it'll screw up communications ("Oh, you wanted to suppress overflow alarms? Too bad, here's a new MAC Address.")

In writing this, I've almost convinced myself, but I want to get a second opinion. Is the first approach superior in a way I'm missing? I'm looking for problems that could cause:

  • Obvious performance issues
  • Data integrity problems
  • Maintenance nightmares
  • Software development headaches
  • Angry future coworkers visiting my house with hatchets, torches, and/or pitchforks in hand.

Best Answer

I think you want to go with your first option. Here's why:

You should design your database with a primary goal of maintaining data integrity. Subtle data corruptions can be harder to debug than source code logic errors. This is because corrupted data may only cause a detectable problem when a series of conditions occur at runtime. Often you can't repeat these reliably.

The best way to focus on data integrity is to create a properly structured database. If your data is going to be edited at any time, then you're best to start with a design that's in Third Normal Form (3NF) at least. In general, you should have a table for each distinct type of thing that you have data about.

Sometimes, you may have two or more different types of things with similar attributes (data items). This puts you in the position of deciding where you make your trade off. Combining the facts about your things into one table means fewer tables. Fewer tables will mean less code as long as the different types of things are and continue to be similar. If you need to start moving in the direction of each type of thing having its own distinct set of attributes, then your code will become more complex and hard to maintain, with lots of branching logic.

It's important to remember that two different tables having the same set of attributes might be coincidental to your system or it might be consequential to your system. Database structure design is not optimized in the same way as code reuse. Cramming logically distinct but structurally similar data into the same table is not a good practice. If you have two or more semantically distinct tables with very similar or even identical structure and your programmer's spidey-senses are tingling too much, then satisfy your urge for code reuse by subclassing in your code, but leave your apples and oranges in their own tables, rather than cramming them into a "fruit" table.

In a case where your two (or more) types of things are different from one another, then having one table for each type of thing is probably an excellent place to start. This will keep your logic simple and your data clean, which makes for a high quality system.