Sql-server – Get all changed tables from CDC

change-data-capturesql server

I am polling CDC tables to synchronize data between SQL Server and MongoDB. The approach I have taken is below:

  1. Enabled cdc for dbo.table1 and dbo.table2 which creates cdc.dbo_table1_CT and cdc.dbo.table2_CT
  2. Created an insert trigger for cdc.dbo_table1_CT and cdc.dbo.table2_CT. Then the trigger will writes the changed table's name and schema to another table, say tbl_consolidated_tracks
  3. Then I have a service which polls tbl_consolidated_tracks and then from that the service reads the changes and syncs to mongodb

What I am trying to do here is to avoid triggers and the usage of tbl_consolidated_tracks.

Can I directly poll the changed tables from any cdc procedure or table?

Best Answer

Instead of using triggers on the CDC tables, you should use the built-in CDC functions for returning the changes recorded by CDC. If you only need the latest value for each row, use cdc.fn_cdc_get_net_changes but if you need all intermediate changes as well, use cdc.fn_cdc_get_all_changes.

These functions are created for each table (or 'capture instance') that you enable CDC on and the functions return changes on a table from the earliest LSN to the latest LSN. There is a good article here that delves into CDC and retrieving the data.

To use it in your periodic polling process, you could do something like this:

  1. Fetch the earliest LSN from cdc.lsn_time_mapping after the last datetime that you polled.
  2. For each table, fetch the max lsn value using sys.fn_cdc_get_max_lsn
  3. Use cdc.fn_cdc_get_net_changes or cdc.fn_cdc_get_all_changes with the LSNs from step 1 and 2 to fetch the changes that have occurred since the last polling.
  4. Process the changes into MongoDB
  5. Log the min and max LSN and the start and end datetime used during this polling to a table. This can be used on the next poll to get the datetime value to determine your next min LSN.

You could even use SQL Server Integration Services which has built-in tools for working with CDC data.