Sql-server – Update Table from SP in SSIS

sql serversql server 2014ssist-sqlupsert

I know there are similar questions to this one, but I could not find one that answered my question.

Inside my current SSIS package…
I have a dataflow task which…
Takes output from a stored procedure (contains about 10k rows) on the db
Inserts that output into a table in the dbo

The problem is that I have 2 choices of doing this (otherwise I break primary key restrictions)

1.) Truncate the table before I insert data to it

this is problematic because what if the SP fails for some reason? Other procedures use this table and depend on it to have data.

2.)Use "Update" command to update 1 row at a time for ~10k rows

This ties up resources on our server, which is already doing some heavy lifting just to execute the stored procedure (yes, it is a poorly written SP, but I digress).
I looked into doing a row count then checking the count variable before truncating, but I couldn't figure out how to make that work inside a dataflow with data stemming from a stored procedure.

So there HAS to be a way I don't know of to easily take my data and ONLY update existing records or insert new records.

P.S. SQL Server 2014

Current SSIS:
Current SSIS

OLE DB Source = EXEC somesp.sp

|

Count = Count rows (used later on, irrelevant to task at hand)

|

Sort = Sorts for data integrity to ensure nothing will break my primary key (data coming from the sp)

|

Destination = Place records from sp result into a destination table

Best Answer

What you described is a classic UPSERT pattern, named after the dual intention of Updating some records and Inserting others.

T-SQL offers the MERGE statement which can be used and has good performance. Not always the easiest to write/maintain.

SSIS options:

1) UPSERT via SSIS - Essentially what you do is identify your update rows and store them in a temp table so that you can handle Updates via an Execute SQL task in the SSIS Control Flow instead of the row by row OLEDB command in the data flow. Essential needs are to ID your update rows and send them to the TMP table destination. Add an Execute SQL task after your data flow and update as a set. You will see huge performance increase over the OLEDB command task and this may be all you need. Handle inserts as normal into the destination table. enter image description here Source Query vs Existing table join if existing then determine if an update is needed otherwise insert. You can either blindly update or in my case I have a quick join if they exist, and another check to see if there is actually an update... you can do this all in a single step depending on your data/join conditions.

2) DELETE/INSERT via SSIS - Very similar to #1, but instead of performing an update you use two SQL tasks to first delete the update rows then insert new versions of them. All non-update rows get written as normal inserts in your data flow.