DECLARE @t1 TABLE (T1_key int, T1_Data char(1), T1_ValidUntil datetime)
DECLARE @t2 TABLE (T2_key int, T2_Data char(1), T2_ValidUntil datetime)
INSERT @t1 VALUES (1, 'A', '2000-01-01')
INSERT @t1 VALUES (1, 'B', '2000-06-30')
INSERT @t1 VALUES (2, 'C', '2005-05-31')
INSERT @t1 VALUES (3, 'D', '2004-12-31')
INSERT @t1 VALUES (3, 'E', '2007-04-30')
INSERT @t1 VALUES (3, 'F', '2008-01-31')
INSERT @t2 VALUES (1, 'R', '2002-03-31')
INSERT @t2 VALUES (2, 'S', '2001-06-30')
INSERT @t2 VALUES (2, 'T', '2003-02-28')
INSERT @t2 VALUES (2, 'U', '2005-05-31')
INSERT @t2 VALUES (3, 'V', '2006-09-30')
INSERT @t2 VALUES (3, 'W', '2007-06-30')
SELECT
T1.*, T2x.*
FROM
@t1 T1
CROSS APPLY
(SELECT TOP 1*
FROM @t2
WHERE T1_key = T2_key AND T2_ValidUntil >= T1_ValidUntil
ORDER BY T2_ValidUntil
) T2x
UNION
SELECT
T1x.*, T2.*
FROM
@t2 T2
CROSS APPLY
(SELECT TOP 1*
FROM @t1
WHERE T1_key = T2_key AND T1_ValidUntil >= T2_ValidUntil
ORDER BY T1_ValidUntil
) T1x
I would say yes, create the surrogate key of TransactionID. It would create a narrow, unique identifier across the table.
One of the reasons why a surrogate key would be best is because of relationships with other tables. If you need to relate the Transaction table with another table (Line_Item_Detail for instance) the entire primary key of the Transaction table would need to be in the related table as a foreign key. If you decided to use your candidate key of Country ID, Store Number, POS Terminal Number, Transaction Date, and Item Code these columns would need to be in every related table. If you would need to update any of these fields they would need to be updated in the related tables too. This gets messy very quickly. With a surrogate key of TransactionID you would only need to add the TransactionID column to your related table. Since this will, should be, and IDENTITY
column we should never have to update it.
Another thing you should be thinking about when creating SQL Server tables is the clustered index. A clustered index is how SQL Server physically stores the data for a table. You can have a table without a clustered index called a heap. In most cases you really should create a clustered index (read more about heaps vs. clustered indexes) Michelle Ufford has an excellent post on creating Effective Clustered Indexes. In short, your clustered indexes should be:
- Narrow – as narrow as possible, in terms of the number of bytes it
stores
- Unique – to avoid the need for SQL Server to add a
"uniqueifier" to duplicate key values
- Static – ideally, never updated
- Ever-increasing – to avoid fragmentation and improve write
performance
A clustered index on TransactionID would fill this criteria nicely.
Since you'll be adding a surrogate key to the Transaction table you should also consider adding an alternate key of Country ID, Store Number, POS Terminal Number, Transaction Date, and Item Code or some other candidate key. This will guarantee there will be no accidental duplicates added to the table. If you don't add the alternate key there is a strong probability that duplicates will creep into the table.
One last thing, you may want to consider normalizing the table further. With the table design your provided if more than one item is added to a transaction you will be adding duplicate data to every row (Store Number, POS Terminal Number, etc). Querying will also become more difficult due to the duplication of data.
Best Answer
I have a couple of questions:
1) Do you need to keep a detailed history of changes? Or only who/when was the record created and who/when last updated it?
2) Assuming no detailed history is needed, the main question I have is whether your alternate solution, of using a separate tracking table(s), will be any easier to maintain than adding the columns to every table. You will either need to: a) add code into every query/stored procedure that manipulates data to update the tracking table; b) add a trigger to every table that needs to be tracked.
3) How often do you expect to query this data? My experience with this type of auditing is you only need it when something bad has happened.
My impression of your options are:
1) I think using the Object_id for the tables as your FOREIGN TABLE ID is a good idea. Why would this change? Do you expect to drop and recreate tables during deployments? And if you do, you can easily map the old object_id with the newly created one and update the tracking table. You could add this logic to your deployments.
2) Wouldn't this negate your reason for not adding the tracking columns into the main tables?
3) See comment for 2), plus I don't think this would work as well as 2).
As far as performance, if you use a separate tracking table, I think it would be best to keep it as separate from the main tables as possible (option 1).
The biggest potential performance boost I can see by using a tracking table is for querying the data. You will be able to tune the indexes specifically for those queries. Also, all your data will be centralised. If the columns are on the base tables additional indexes will probably be an overall disadvantage. BUT!!! Back to my question 3): How often are you going to need to query this data?
As far as day-to-day performance your real impact will come down to how you implement updates of the tracking table. Changes to the stored procedures would be better than triggers in my opinion.
I'm sorry if I haven't really answered your question, but I hope my comments help.