SSIS Package – Should Linked Server be Avoided by Utilizing JOIN Task?

sql serversql server 2014ssis-2014

We are trying to setup a centralized SSIS server (SQL Server 2014). I want to know the following:

  1. Any procedure which is utilizing Linked Server and if that procedure gets called from SSIS SQL Task, is it better to put that logic inside SSIS itself without utilizing Linked Server?
  2. If we ask end user to put logic inside the SSIS package (i.e. JOIN…) and not use Linked Server, is it better performance wise?

Example: I have Table A from Server_1 and Table B from Server_2. If I am creating a stored procedure on Server_1, normally I will be joining Table A to Table B using Linked Server to Server_2. If I put this same logic inside the SSIS package and use [JOIN] task instead, would that be better?

Best Answer

You'll have to test to find which works best for your scenario.

By "join task" I assume you mean a Merge Join transformation which requires the input to be sorted. Sorting can be expensive especially if using a Sort transformation. Even sorting at source will take time.

You could try a Lookup transformation which works best using full cache, but if the reference data is large the cache will obviously require a lot of memory.

Another option is to use SSIS to localise the table and then do the join as you would do normally.

Rarely would I use linked servers for anything other than ad-hoc queries.