I wouldn't want to have 200 data flows in a single package. The time it'd take just to open up and validate would make you old before your time.
EzAPI is fun but if you're new to .NET and SSIS, oh hell no, you don't want that. I think you'll spend far more time learning about the SSIS object model and possibly dealing with COM than actually getting work done.
Since I'm lazy, I'll plug BIML as a free option you didn't list. From an answer on SO https://stackoverflow.com/questions/13809491/generating-several-similar-ssis-packages-file-data-source-to-db/13809604#13809604
- Biml is an interesting beast. Varigence will be happy to sell you a license to Mist, but it's not needed. All you would need is BIDSHelper; then browse through BimlScript and look for a recipe that approximates your needs. Once you have that, click the context-sensitive menu button in BIDSHelper and whoosh, it generates packages.
I think it might be an approach for you as well. You define your BIML that describes how your packages should behave and then generate them. In the scenario you describe where you make a change and have to fix N packages, nope, you fix your definition of the problem and regenerate packages.
Or if you've gained sufficient familiarity with the framework then use something like EzAPI to go and fix all the broken stuff. Heck, since you've tagged this as 2005, you could also give PacMan a try if you're in need of making mass modifications to existing packages.
SSIS Design considerations
Generally speaking, I try to make my packages focus on solving a single task (load sales data). If that requires 2 data flows, so be it. What I hate inheriting is a package from the Import/Export Wizard with many unrelated data flows in a single package. Decompose them into something that solves a very specific problem. It makes future enhancements less risky because the surface area is reduced. An additional benefit is that I can be working on loading DimProducts while my minion is dealing with loading the SnowflakeFromHell package.
Then use master package(s) to orchestrate the child workflows. I know you're on 2005, but SQL Server 2012's release of SSIS is the cat's pajamas. I love the project deployment model and the tight integration it allows between packages.
TSQL vs SSIS (my story)
As for the pure TSQL approach: in a previous job, they used a 73-step Agent job to replicate all of their Informix data into SQL Server. It generally took about 9 hours but could stretch to 12 or so. After they bought a new SAN, it went down to about 7+ hours. The same logical process, rewritten in SSIS, ran consistently under 2 hours. Easily the biggest factor in driving down that time was the "free" parallelization we got from SSIS. The Agent job ran all of those tasks in serial. The master package basically divided the tables into processing units (5 parallel sets of serialized "replicate table N" tasks), where I tried to make the buckets quasi-equal-sized units of work. This allowed the 60 or so lookup/reference tables to get populated quickly, and then the processing slowed down as it got into the "real" tables.
Other pluses for me with SSIS are that I get "free" configuration, logging, and access to the .NET libraries for square data I need to bash into a round hole. I think it can be easier to maintain (and hand off maintenance of) an SSIS package than a pure TSQL approach by virtue of the graphical nature of the beast.
As always, your mileage may vary.
Best Answer
Given that:
Then I would just copy and paste the core data flow and make per-table changes as appropriate. For a non-repeating process covering a small number of tables (not 200 or more), you don't want to spend too much time creating a repeatable process that will never be repeated ;-). And you are going to have to manage the per-table differences anyway, so you might as well just put your effort into that part.
And it should be possible in SSIS to define a reusable component (Script Task) to handle some of the "common" manipulations. If not, just create a stored procedure that accepts a table name and handle the common steps there. Then, in SSIS, call that stored procedure for each table; that way you have a single definition for the common stuff, and if you need to make a slight change to that common logic, you only need to make it in one spot :).
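As a rough sketch of the "one stored procedure, called per table" idea from a C# Script Task (the procedure name `dbo.usp_ApplyCommonSteps`, its `@TableName` parameter, the connection string, and the table list are all hypothetical placeholders, not anything from your environment):

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

class CommonStepsCaller
{
    static void Main()
    {
        // Placeholder connection string and table list for illustration only.
        const string destCs = "Data Source=DestServer;Initial Catalog=DestDb;Integrated Security=SSPI;";
        string[] tables = { "dbo.Customer", "dbo.Orders", "dbo.OrderDetail" };

        using (var conn = new SqlConnection(destCs))
        {
            conn.Open();
            foreach (string table in tables)
            {
                // dbo.usp_ApplyCommonSteps is a hypothetical proc holding the shared
                // manipulation logic; it is handed one table name at a time.
                using (var cmd = new SqlCommand("dbo.usp_ApplyCommonSteps", conn))
                {
                    cmd.CommandType = CommandType.StoredProcedure;
                    cmd.Parameters.AddWithValue("@TableName", table);
                    cmd.ExecuteNonQuery();
                }
            }
        }
    }
}
```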
(please see UPDATE section below) Another option for handling data migration when you are not pulling all fields over, and possibly renaming fields, is to use the SqlBulkCopy class in .NET, as it handles those things quite nicely. You can either create a C# task in SSIS to use this class, or you can use a SQLCLR stored procedure such as DB_BulkCopy, which is part of the SQL# SQLCLR library (which I am the author of, but this stored procedure is in the Free version). I describe this in a little more detail in the following answer: Import data from one Database to another script
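For a feel of what that looks like in a C# task, here is a minimal sketch; the server, table, and column names are made up, and the column mappings are where the "skip some fields, rename others" part happens:

```csharp
using System.Data.SqlClient;

class BulkCopyExample
{
    static void Main()
    {
        // Placeholder connection strings for illustration only.
        const string srcCs  = "Data Source=SourceServer;Initial Catalog=SourceDb;Integrated Security=SSPI;";
        const string destCs = "Data Source=DestServer;Initial Catalog=DestDb;Integrated Security=SSPI;";

        using (var srcConn = new SqlConnection(srcCs))
        using (var cmd = new SqlCommand("SELECT CustID, CustName FROM dbo.Customer;", srcConn))
        {
            srcConn.Open();
            using (var reader = cmd.ExecuteReader())
            using (var bulk = new SqlBulkCopy(destCs))
            {
                bulk.DestinationTableName = "dbo.Customer";
                // Map only the columns you actually want, renaming along the way;
                // unmapped source columns simply never make the trip.
                bulk.ColumnMappings.Add("CustID", "CustomerID");
                bulk.ColumnMappings.Add("CustName", "CustomerName");
                bulk.BatchSize = 10000;
                bulk.WriteToServer(reader);
            }
        }
    }
}
```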
UPDATE
The following requirement was added a day after I posted the suggestions above:
In light of this new requirement, I think that using the SqlBulkCopy alternative is less attractive than sticking with SSIS in the manner I described first. The only way to de-dupe the data using SqlBulkCopy / DB_BulkCopy would be to run the process from the destination / target server, load the data from the source server into a temporary table (it would have to be a global temporary table, meaning one starting with ##) using SqlBulkCopy or DB_BulkCopy, and then insert into the real destination table from the global temporary table while filtering out the "duplicates" based on whatever rules you have to determine what a duplicate is.
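A rough sketch of that stage-then-filter pattern from a .NET task follows; the table and column names and the "duplicate means same CustomerID" rule are assumptions for illustration, and a DB_BulkCopy call could stand in for the SqlBulkCopy step if you go the SQLCLR route:

```csharp
using System.Data.SqlClient;

class StageAndDedupe
{
    static void Main()
    {
        // Placeholder connection strings for illustration only.
        const string srcCs  = "Data Source=SourceServer;Initial Catalog=SourceDb;Integrated Security=SSPI;";
        const string destCs = "Data Source=DestServer;Initial Catalog=DestDb;Integrated Security=SSPI;";

        using (var destConn = new SqlConnection(destCs))
        {
            destConn.Open();

            // Global temp table (##) so it is visible across sessions on the destination.
            new SqlCommand("CREATE TABLE ##CustomerStaging (CustomerID int, CustomerName nvarchar(100));",
                           destConn).ExecuteNonQuery();

            // Pull from the source server and bulk copy into the staging table.
            using (var srcConn = new SqlConnection(srcCs))
            using (var cmd = new SqlCommand("SELECT CustomerID, CustomerName FROM dbo.Customer;", srcConn))
            {
                srcConn.Open();
                using (var reader = cmd.ExecuteReader())
                using (var bulk = new SqlBulkCopy(destConn) { DestinationTableName = "##CustomerStaging" })
                {
                    bulk.WriteToServer(reader);
                }
            }

            // Insert only rows not already present; the NOT EXISTS predicate is where
            // your actual "what counts as a duplicate" rule would go.
            const string dedupeSql = @"
INSERT INTO dbo.Customer (CustomerID, CustomerName)
SELECT s.CustomerID, s.CustomerName
FROM ##CustomerStaging AS s
WHERE NOT EXISTS (SELECT 1 FROM dbo.Customer AS c WHERE c.CustomerID = s.CustomerID);

DROP TABLE ##CustomerStaging;";
            new SqlCommand(dedupeSql, destConn).ExecuteNonQuery();
        }
    }
}
```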