Sql-server – Capturing datetime of change in SQL Server CDC

change-data-capturesql-server-2008system-tables

So we've started exploring using change data capture on one of our production databases. We'd like to know the datetime of each change. Reading through walkthrough's and tutorials etc it seems that the standard approach is to use the LSN to relate to the cdc.lsn_time_mapping system table. This approach works but is not very straightforward nor performant when talking about 100's of thousands of changes a day.

In a test environment I made the following adjustment to the change track tables. I issued an ALTER TABLE statement to add a column to the end called [__ChangeDateTime] and made it's default value GetDate(). The approach seems to work, the change tracking still functions normally, the datetime's are being captured. But mucking around with system tables makes me a little nervous.

If this isn't a system field that Microsoft added from the beginning they must of had their reasons. Since they instead opted for the LSN to cdc.lsn_time_mapping approach am I setting myself up for problems by creating my own hack this way?

UPDATE:

Discovered during testing that GetDate() at times isn't precise enough for our needs – multiple changes sharing the same time. Recommend using sysdatetime() and datetime2 to move the value out to the nanosecond. Option for 2008+ only obviously.

Best Answer

Remember that CDC uses a log reader agent to populate the change table. Why is that important? By that mechanism, rows show up in the change tables asynchronously to the changes made in the base tables.

There are actually 3 different time points that can be recorded, in reverse chronological order:

  1. The time the change was delivered to the change table (which is what you're recording).
  2. The time the transaction that contained the change committed (using cdc.lsn_time_mapping).
  3. The time with which you manually populate a column in the base table (using a default constraint, a trigger, etc.).

So the first thing is to be clear on what you want to record. Usually we would care about either #2 or #3.

If the LSN mapping mechanism (#2) isn't performing well enough for you, the only supported alternative is to add a column to the base table and populate it yourself (#3).

With regards to changing the internal tables, as a matter of policy, I think it's best to avoid hacking around with internals when there are supported alternatives. The very last thing you want is an important production system going down, needing to call Product Support, and being denied service because of something like this. Never mind the issues of it potentially breaking things (upgrades), or being broken because it was unexpected (turn CDC off, then on again, as mentioned in the other answer).