Sql-server – When to use CDC to track history

change-data-capturedata-versioningsql server

SQL Server Change Data Capture is a feature that reads historical data from the SQL Server transaction logs and stores them in a special table.

Through the use of special table value functions (TVF) it then allows the user to query this data, making it either possible to get all the changes on a specific table or only the net changes that resulted from the changes within a specific time.

CDC has certain advantages

  • It can be configured to only track certain tables or columns.
  • It is able to handle model changes to a certain degree.
  • It does not affect performance as heavily as triggers because it works with the transaction logs.
  • It is easily enabled/disabled and does not require additional columns on the table that should be tracked.

It also has some disadvantages:

I have read quite a lot about CDC and while I know now how to use it, I am still not sure if it is the right tool for me.

  1. For which tasks/scenarios is CDC the right tool? (e.g. Allowing users to restore a data object to a certain point in time? Auditing? Showing the complete history of data?)
  2. When should you rather not use CDC, but resort to a custom trigger-based solution?
  3. Is it ok to use CDC in an operational database and make use of the CDC data within an operational application? (e.g. showing it to the end user) Or is this clearly a misuse of this feature?

I commonly hear that CDC is an audit tool, but isnt that what SQL Server Audit is for? Are they both different tools for the same task? Or can CDC used for other things?

My current scenario is that I am asked to build a reliable data framework which is supposed to be the basis for multiple future applications. The exact requirements are blurry, but one is that it should be able to track data history and restore older entries together with all related data from other tables. I am evaluating CDC right now as an option, but am uncertain if this is the way to go, because I can't really find any recommended use cases.

While I appreciate advice for my specific scenario, answers should give general advice about when or when not to use Change Data Capture.

Best Answer

Firstly,

Change data capture is available only on the Enterprise, Developer, and Evaluation editions of SQL Server.

So that may decide for you if any of your customers will not have the enterprise editions, or you don't yet know you will be using the enterprise editions. (As the spec includes "multiple future applications" this may be an real issue for you)

Unlike triggers it is not real time, this is both an advantage and a disadvantage. Using triggers always slow down an update.

I worked on one system when we used triggers (generated by CodeSmith), as well as tracking all the changes to the records, we also linked the changes together to a “history” table that included the module of the application that made the change, and the UI item the user used to make the change.

However you may be best solving this at the application level, by say writing all update to a message queue that is then replayed to create a database at any given point of time, see Temporal Patterns on Martin Flowler blog for a good overview of options.