Sql-server – standard language/interface for programmatic ETL in SQL Server

etlsql server

I'm currently in the process of creating ETLs for our data warehouse. We're using SSIS 2008, but we're running into issues, the biggest of which is the difficulty in re-using components. We have separate packages for each table and each package takes as input a number of variables from a parent package. As we make changes to these input variables, we're required to go into each package (we have 15 or so now, but this number is going to grow significantly) and modify the package to deal with those changes. There are also other issues, including the inability to run arbitrary SQL for our extraction, poor logging capabilities, etc.

This entire process would be much more robust if there were a way develop our ETLs in code, enabling code reuse, common libraries, better unit tests, etc. Is there a de facto standard ETL language / API for SQL Server? I'm looking to avoid GUI tools as much as possible.

Edit: I should mention my background. I'm not a DBA and have no formal (or informal) DBA training, I've basically figured this stuff out as I went along, so there is every likelihood that I'm attempting to do inappropriate things with SSIS or approaching this ETL project from the wrong angle. Also, I'm currently employed in state government, so any solutions that require the purchase of a new software package aren't within the realm of possibility.


Here's one of our tasks. We are using a single SSIS Package to load each table in our warehouse. Each Fact package and Dimension package are generally the same, they only differ in

  • Extractions from the source database
  • Manipulations in a Data Flow
  • Merges into the destination table

What I would like to be able to do (that I'm finding to be difficult to do in SSIS)

  • Load the extraction query from a text file. When developers are writing and testing their extraction queries, I should not have to manipulate their query in any way before SSIS runs it and I should not have to cut and paste the query into a DB Source object.
  • Test each component individually. I should be able to test the complete ETL process for an individual table in isolation, independent of other table loads.
  • Make modifications to the shared logic in one place, not have to edit each individual package. Every package loads data into the audit tables the same way, if I want to change the data that is loaded audited, I don't want to have to edit all 15 packages (this number is going to get much much larger over time).

The entire process feels like it would be much easier to implement and more robust if done programatically with proper use of shared code.

Best Answer

There is a tool that enables this - http://www.varigence.com/products/biml.html

There's a commerical version, but we also include some of the BIML functionality in BIDS Helper, a free tool. http://bidshelper.codeplex.com/

I'm happy to answer any questions that you might have about it.

This is a tool that my company provides.