Sql-server – SSIS comparison load between new, updated and deleted records and populate a staging database

etlsql serverssis

So far,

My control flow looks like:

Execute SQL Task
– This executes a statement that creates a staging table. If one exists, it is dropped so everytime the package is run, the staging table will be able to track the updates that have occurred. My code for this is:

IF OBJECT_ID('CDC_Staging', 'U') IS NOT NULL
   DROP TABLE  CDC_Staging;

CREATE TABLE CDC_Staging
(
    [Employee_ID] [int] PRIMARY KEY NOT NULL,
    [FirstName] [nvarchar](255) NULL,
    [LastName] [nvarchar](255) NULL,
    [Education] [nvarchar](255) NULL,
    [Occupation] [nvarchar](255) NULL,
    [YearlyIncome] [float] NULL,
    [Sales] [float] NULL
);

Data Flow Task
– The OLE DB Source is the Source Database. The Lookup uses the results of this query:

SELECT [Employee_ID] FROM [CDC_Target]

Then I use a conditional split that puts inserts in the Target Database and Updates to Staging
enter image description here

My Final control flow task is an execute SQL Task. I used this query to update the target database.

UPDATE [dbo].[CDC_Target]
   SET [FirstName] = Staging.[FirstName]
      ,[LastName] = Staging.[LastName]
      ,[Education] = Staging.[Education]
      ,[Occupation] = Staging.[Occupation]
      ,[YearlyIncome] = Staging.[YearlyIncome]
      ,[Sales] = Staging.[Sales]
FROM [CDC_Target]
   INNER JOIN
     [CDC_Staging] AS Staging 
     ON [CDC_Target].Employee_ID = Staging.Employee_ID

I just inserted 4 more records into my source table. After I ran the package, the Source Table and Target table had the correct data but the staging table did not contain the 4 inserts that I committed.

Am I doing something wrong here? Any suggestions on how to better do this?

PS. I followed a tutorial explaining how to do this, I'm a Jr. DBA but my team lead is wanting me to learn SSIS.

Best Answer

If I've understood correctly your OLE DB Destination is to the 'target' table and your OLE DB Destination 1 is to the 'staging' table? That would explain why your new rows aren't appearing in the 'staging' table as you're only sending them to the 'target' table.

If you want all rows (updated or inserted) to appear in the 'staging' table, I would suggest you remove the Lookup and Conditional Split and simply insert all rows to the 'staging' table. Then use a Execute SQL task to fire a MERGE statement to update or insert as appropriate to the 'target' table. In my experience this is much more performant that using the lookup to decide whether to insert or update