Sql-server – Database Design guidance required

database-designsql serversql-server-2008sql-server-2008-r2

I have scenario where there is Table T with columns PK,A, B, C,D

TABLE t
(
PK,
A,
B,
C,
D
)

i have to insert data from different sources ie columns A,B,C,D are calculated separately..
and they same primary PK.

I would like to ask if it better to create indivdiaul tables for each column and then join them using PK and insert into final table T

pk|A ,pk|B ,PK|C ,PK|D —-> JOIN AND INSERT INTO TABLE T(pk,a,b,c,d)

or

it is ok to update ie

first have some thing like T (PK,A,B,C,D)
DATA SOMETHING like (1,a1,NULL,NULL,NULL)

run update statement for all columns B,C,D sequentially and get all the data

T(1,a1,b1,c1,d1)

Best Answer

I'm not sure either of these are really great options. Here's my thoughts:

pk|A ,pk|B ,PK|C ,PK|D ----> JOIN AND INSERT INTO TABLE T(pk,a,b,c,d)

With this scenario you have redundant data. Unless your individual PK tables are just staging tables, I think you could run into some serious issues here, especially with different consumers pulling data from different tables. I wouldn't chance that whatsoever with this design.

Not to mention that if PK:A, PK:B, PK:C, and PK:D are all 1:1 relationships, I don't see any normalization benefit out of this either.

T (PK,A,B,C,D) DATA SOMETHING like (1,a1,NULL,NULL,NULL) run update statement for all columns B,C,D sequentially and get all the data T(1,a1,b1,c1,d1)

I'm not sure I like this design either, although I'd put it at a higher preference then the former. The reason behind this is because what happens if business dictates that columns A, B, C, and/or D really aren't NULLable. Now you can run into a big problem of data conflicting with business constraints.

I think the real solution here may be to take a step back and analyze your data modification. Instead of having a step-based data insert, is there no way to do a single insert of all the required and necessary data into this table? In the application, retrieve all of the required data before each item's insert. That way you won't be defying any business rules, having redundant data, or additional operation overhead of the designs above.

In other words, if data columns A, B, and C are required, then you shouldn't be pushing that data out to this table until you have it. Without all of those required columns you wouldn't have compliant data, and it shouldn't live in that table until it is. And if D is nullable and optional data, then sure that's fine if it doesn't appear until a later time with a different DML operation.