Sql-server – SSIS vs T-SQL Join Performance, Difference Databases on Same server

performanceperformance-tuningsql serversql-server-2016ssis

We are researching the performance difference between SSIS Lookup and T-SQL Join. We want to join two tables; tables are in the same SQL server instance, different databases.

I suspect for small table joins, the difference is minimal or negligible. In this case, our team prefers T-SQL, easier to code/script than writing diagrams. Additionally in the DevOps perspective, we can compile/build scripts in a DB project; unfortunately, SSIS will not compile T-SQL correctly, I can write 'testabcd' in an SSIS Execute SQL statement, and the project will still build/compile.

However, for a large number of rows which will take longer processing, what is quicker? T-SQL which has indexes and statistics, or SSIS which is conducted all in memory?

I read these articles with different viewpoints, the team is trying to gain consensus.

https://derekdb.wordpress.com/2012/03/13/ssis-lookup-or-t-sql-join/
http://www.sqlservercentral.com/blogs/jamesserra/2011/08/29/when-to-use-t_2D00_sql-or-ssis-for-etl/

Let's assume, T-SQL Engine and SSIS are provided same hardware: CPU and Memory. Given the same specs, I would like to know performance speed on an internal algorithm perspective.

Best Answer

Given the same specs, would like to know performance

If your data is on a single SQL Server instance, TSQL joins should always be faster than an SSIS lookup. This is not really close. A TSQL query will leverage indexes and statistics, and the cost-based query optimizer. The join will be performed inside the SQL Server instance, where it is not only more efficient, but it has greater memory and disk resources than an SSIS pipeline.

SSIS lookups are primarily intended for destination-side lookups, heterogenous data scenarios, and for sources without a query processing engine (like flat files).

Further, when your data source is a single SQL Server instance, and your destination is a single (possibly different) SQL Server instance, personally I would almost never use any SSIS data flow transformations. Instead always using source-system SQL for extract-side transformation, and stage-and-merge for destination-side transformation.