Sql-server – SSIS Data Source: Embedded Queries, Stored Procedures, or Functions

performanceperformance-tuningsql serversql-server-2016ssisstored-procedures

We are moving data from OLTP server into Data Warehouse server, utilizing SSIS to conduct this.

When writing queries from the OLTP Data Source, it it better to utilize?

1) Inline Embedded queries

2) Functions

3) Or Stored Procedures

In researching and testing – Embedded queries are great, however we want code placed in SSDT Database projects to build/compile. So if column name changes/deleted, we would know in Devops portion. If writing in SSIS Datasource 'Test abcd etc', SSIS will still build/compile. If writing in SQL Database project, 'select testabcd from dbo.exampletable', SSDT DB project will fail to build/compile.

I hear Functions are great, since it will display the metadata. However, Multi Statement table value functions are Slow, which team will require sometime.

Stored procedures I am reading have issues with creating metadata. Is this true ? Is there a workaround, heard its possible to create metadata in stored procedures.

Can someone let me know good option between three, or any other options?


4) * Did not want to incorporate views, since team wants to parameterize queries by source Create/Update Dates, if ever intending to reprocess data.

enter image description here

Best Answer

I would recommend using stored procedures. As far as issues with the metadata, this can occur when temporary tables are used. Using the WITH RESULT SETS option to execute the stored procedure will allow you to define the metadata of the columns instead and SSIS can bypass this step. An example of this follows. Also note that if any column definitions were to change or if new columns are added, this would need to be updated as well.

You may also run into problems with using parameters with an OLE DB source as in your screenshot. However this can be overcome by using the "SQL command from variable" option for the Data Access field. A string variable can be created as an expression that executes the stored procedure using a parameter sent to the package (or project). An example of this is below. This example uses the DT_STR function to cast the numeric parameter to a string in order to concatenate it. This is not necessary for string parameters or course, I only included this in case you use numeric parameters. The WITH RESULT SETS option can also be used in an expression concatenating it with the parameter following the same method.

WITH RESULT SETS

   EXEC ('dbo.YourSP')
    WITH RESULT SETS
      (
        (
            ColumnA INT,
            ColumnB VARCHAR(250),
            ColumnC DATETIME 
        )
      );

Variable Expression using Parameter

"EXEC dbo.YourSP " +  (DT_STR, 10, 1252)@ @[$Package::YourParameter]

Functions Followup Response: If you prefer to use functions and they better suite your needs you can go that route. I tend to prefer stored procedures in general as they offer for flexibility (perform DML/DDL, use temp tables, etc.). If you do use functions, I'd recommend using an inline over a multi-statement table valued version. The inline function is essentially treated like a view and will use the proper statistics to create better execution plans, whereas multi-statement functions won't get these benefits.