Sql-server – Change Tracking across several Tables

change-data-capturechange-trackingsql serversql-server-2008-r2

I'm looking into providing DB level change tracking within my system. I need to be able to track changes at an entity level, rather than just individual tables.

Right now, we have triggers on each of the tables of interest, that write the tables PK into a Change_Event table. We can then query that table, and boil it down to the PK that represents the entity as a whole. We can then retrieve the data for that entity and perform the relevant actions upon it.

As an example, consider the following (simplified) example:

CREATE TABLE Employee
(
    Id INT IDENTITY PRIMARY KEY,
    Name VARCHAR(250) NOT NULL,
    Telephone VARCHAR(15) NOT Null
);

CREATE Table Visit
(
    Id INT IDENTITY PRIMARY KEY,
    VisitDate DATETIME2 Not NULL,
    [Address] VARCHAR(500) NOT NULL,
    VisitorId INT NOT NULL,
    CONSTRAINT FK_Visit_VisitorId FOREIGN KEY (VisitorId) REFERENCES Employee (Id)
);

INSERT into Employee (Name, Telephone)
VALUES ('John Doe', '123456789'),
('Jane Smith', '999555333');

INSERT INTO Visit (VisitDate, Address, VisitorId)
VALUES (SYSDATETIME(), '123 Fake St', 1),
(GETDATE() + 5, '99 Test Av', 2);

In this example, the Visit Entity is considered to be the following xml:

SELECT Id,
       CAST((
           SELECT Id,
                  VisitDate,
                  Address,
                  (
                      SELECT E.Id,
                             E.Name,
                             E.Telephone
                      FROM Employee E
                      WHERE E.Id = V.VisitorId
                      FOR XML RAW ('Visitor'), TYPE, ELEMENTS
                  )
                  Visitor
           FROM Visit V
           WHERE V.Id = Visit.Id
           FOR XML RAW ('Visit'), ELEMENTS
       ) AS XML) AS Data
FROM Visit

I need to be able to know, when one of those entities has changed. If I were to change the telephone number of the Employee, I need to be able to see that that has changed the Visit entity so that I can reprocess it.


Right now, my change table records the Employee Id. I then run a query that gives me all the visits with that employee as the VisitorId. That sounds fine when you are only looking at two tables, but when you factor in several tables (and potentially several levels of abstraction between them) the performance can get pretty slow.

I've looked into Change Data Capture, but that still seems to be capturing at a table level. Equally, I could add Modified date columns to all tables, and combine all the modifieds from the view to produce a single, max modified value – but considering I need to filter on that field, I can't imagine performance would be that great either.

Is there a recommended method of dealing with this within SQL Server? As an added consideration – whilst the Instance is SQL 2008, the DB is still in 2000 compatibility mode.

UPDATE

I've been looking into Change Tracking, which works pretty well at a table level. Unfortunately it doesn't work too well with the select above.

I even tried creating a schemabound view based on that select, and enabling change tracking on the view, but this appears to be an option that can't apply to views.

I can make it work by using something like the below:

WITH ChangedVisits AS
(
    SELECT DISTINCT OV.Id
    FROM dbo.Visit OV
        INNER JOIN Employee E ON OV.VisitorId = E.Id
        LEFT JOIN CHANGETABLE(CHANGES Visit, @LastSyncVersion) AS VCT ON OV.Id = VCT.Id
        LEFT JOIN CHANGETABLE(CHANGES Employee, @LastSyncVersion) AS ECT ON OV.VisitorId = ECT.Id
    WHERE VCT.Id IS NOT NULL
        OR ECT.Id IS NOT NULL
)
SELECT Id,
       CAST(
       (
           SELECT V.Id as [@Id],
                  V.VisitDate,
                  Address,
                  (
                      SELECT E.Id,
                             E.Name,
                             E.Telephone
                      FROM dbo.Employee E
                      WHERE E.Id = V.VisitorId
                      FOR XML PATH(''), TYPE, ELEMENTS
                  )
                  Visitor
           FROM dbo.Visit V
           WHERE V.Id = CV.Id
           FOR XML PATH ('Visit'), ELEMENTS
       )
       AS XML) AS Data
FROM ChangedVisits CV

But the drawback of that is the need to join on each of the tables and an associated CHANGETABLE object in order to work out if something has changed.

Best Answer

My plan for this, is to add a rowversion to each of the source tables, and display the max rowversion from the source tables.

This is making the assumption that rowversion is unique across the entire database and is not incremented in each table (I need to check if that assumption is correct).

SELECT 
 ft.[col1]
,ft.[col2]
,ft.[col3]
,ft.[col4]
,dt.dim_id
,dt.name
, (SELECT MAX(rv) 
   FROM (VALUES (ft.row_ver), (dt.row_ver)) AS value(rv)) AS max_row_ver
FROM [dbo].[myFirstTable] ft
INNER JOIN [dbo].[dim_table1] dt ON ft.[col5_fk] = dt.dim_id

Obviously this only gives you an identifier to use to track changes, you would still need to create a table to track it.