Sql-server – Generate insert statements for table AND related data

foreign keyinsertmigrationsql server

I'm looking to move data from tables and related tables between 2 databases. The trouble is the primary keys are out of sync.

Is anyone aware of a tool which can generate insert statements for a given table and it's related data (perhaps walking foreign keys)?

I've been looking at SSMS toolpack, RedGate's tools, Database project data compare, various open source data diff tools, etc etc, but not found anything.

I'm aware of being able to walk foreign keys, so perhaps some amazingly clever SQL script might be able to do it?

Seeing as I've looked at so many tools, I'm guessing it's probably blooming difficult / impossible to do it fully automated. But perhaps a technique that needs some manual intervention as well?

Or perhaps anyone has hints on how they handle this sort of task. I don't really fancy writing lots of manual sql to handle the insert – as it's bound to be error prone.

What I'm really hoping for is a way of generating SQL to tie up the PK and FK's whilst doing the inserts in the new database.

Say for example we had:

+--------------+--------------+
|     personid |     name     |
+--------------+--------------+
|            1 |         bob  |
|            2 |         fred |
+--------------+--------------+

+----------+----------------+----------+
| animalid |     animal     | PersonID |
+----------+----------------+----------+
|        1 |         cat    |        1 |
|        2 |         dog    |        1 |
|        3 |         lizard |        2 |
+----------+----------------+----------+

It would be awesome to copy bob to the new database and bob's cat and dog. Then fred and fred's lizard.

Best Answer

The SSMS Tools Pack allows you to generate insert scripts for a particular table. On the Features page, it's the "Generate Insert statements from resultsets, tables or databases." That won't get all child tables automatically, however.

In an older Stack Overflow post, they mention a few methods for generating inserts. The first method (using SSMS's Generate Scripts functionality) will work if you have SQL 2005 or better. The second method, a downloaded stored procedure, could do the job if you're using 2000.

Assuming you have 2005 or later, right-click on the database and go to Tasks --> Generate Scripts. From there, you can select specific database objects. You'll select all of the necessary tables (so this is the manual process).

To get the list of tables, you are correct about walking the foreign key list. That's at sys.foreign_keys. The following script will get a list of all foreign key dependencies from a central point. Note that this does not get the child records, but only the "parent" records. For example, suppose T3 has a foreign key (T2ID) to T2, and T2 has a foreign key (T1ID) to T1. If we run this script on T3, it will return three tables: T3, T2, and T1. If we run it on T2, it will return two tables: T2 and T1. If we run it on T1, it will return only T1.

declare @SchemaName sysname = 'dbo';
declare @TableName sysname = 't2';

with foreignkeys as
(
    select
        tbl.object_id as ObjectID,
        tbls.name as SchemaName,
        tbl.name as TableName,
        convert(varchar(8000), tbls.name + '.' + tbl.name) as [Path]
    from 
        sys.foreign_keys fk
        inner join sys.tables tbl on fk.parent_object_id = tbl.object_id
        inner join sys.schemas tbls on tbl.schema_id = tbls.schema_id
    where
        tbls.name = @SchemaName
        and tbl.name = @TableName

    UNION ALL

    select
        ref.object_id as ObjectID,
        refs.name as SchemaName,
        ref.name as TableName,
        convert(varchar(8000), fks.[Path] + '/' + refs.name + '.' + ref.name) as [Path]
    from
        sys.foreign_keys fk
        inner join sys.tables ref on fk.referenced_object_id = ref.object_id
        inner join sys.schemas refs on ref.schema_id = refs.schema_id
        inner join foreignkeys fks on fks.objectid = fk.parent_object_id
    where
        fks.objectid <> ref.object_id       
        AND fks.[Path] NOT LIKE '%' + refs.name + '.' + ref.name + '%'
)
select 
    SchemaName + '.' + TableName
from 
    foreignkeys

UNION

select @SchemaName + '.' + @TableName;

If you want to get the child tables as well (i.e., getting the entire foreign key dependency chain), another CTE will do it:

declare @SchemaName sysname = 'dbo';
declare @TableName sysname = 't2';

with foreignkeys as
(
    select
        tbl.object_id as ObjectID,
        tbls.name as SchemaName,
        tbl.name as TableName,
        convert(varchar(8000), tbls.name + '.' + tbl.name) as [Path]
    from 
        sys.foreign_keys fk
        inner join sys.tables tbl on fk.parent_object_id = tbl.object_id
        inner join sys.schemas tbls on tbl.schema_id = tbls.schema_id
    where
        tbls.name = @SchemaName
        and tbl.name = @TableName

    UNION ALL

    select
        ref.object_id as ObjectID,
        refs.name as SchemaName,
        ref.name as TableName,
        convert(varchar(8000), fks.[Path] + '/' + refs.name + '.' + ref.name) as [Path]
    from
        sys.foreign_keys fk
        inner join sys.tables ref on fk.referenced_object_id = ref.object_id
        inner join sys.schemas refs on ref.schema_id = refs.schema_id
        inner join foreignkeys fks on fks.objectid = fk.parent_object_id
    where
        fks.objectid <> ref.object_id       
        AND fks.[Path] NOT LIKE '%' + refs.name + '.' + ref.name + '%'
),
parentsandchildren as
(
    select
        ObjectID,
        SchemaName,
        TableName
    from
        foreignkeys

    UNION ALL

    select 
        tbl.object_id as ObjectID,
        tbls.name as SchemaName,
        tbl.name as TableName
    from
        sys.foreign_keys fk
        inner join sys.tables tbl on fk.parent_object_id = tbl.object_id
        inner join sys.schemas tbls on tbl.schema_id = tbls.schema_id
        inner join parentsandchildren fks on fks.objectid = fk.referenced_object_id
    where
        not exists (select * from foreignkeys fk where fk.objectid = tbl.object_id)

)
select 
    SchemaName + '.' + TableName
from 
    parentsandchildren

UNION

select @SchemaName + '.' + @TableName;

If you run this script on T1, T2, or T3, it will return the list of all three tables (because it traverses the chain in both directions). Both of these scripts are also smart enough to not get caught up in self-joins and loops, should your table structure have either of those.

Running one of these two scripts would get you the list of tables that you could then select in the Generate Scripts menu. Just make sure to set the Type of data to script option to Schema and Data, or else you'll only get the table structure.

EDIT

I don't know of any tool which will script out individual records the way you're asking for them. Even writing this would be difficult at first glance, because you would need to know which records you need to insert, all records which depend upon that, all records which depend upon those, etc. That's a fairly straightforward recursion problem by itself. But then you'd need to maintain a structure telling when each row gets inserted so you don't double-insert records. For an arbitrary table structure (especially one with loops), this becomes a challenge.

But based strictly on your example, I'm not sure if there's actually any value to doing what you want regarding specialized insertion. As far as SQL Server is concerned, there's no difference between the two scenarios:

Scenario A:

  1. Insert Bob's Person record
  2. Insert Animals related to Bob
  3. Insert Fred's Person record
  4. Insert Animals related to Fred

Scenario B:

  1. Insert Bob and Fred's Person records
  2. Insert all Animal records related to Bob and Fred

After the insertions are complete, all select statements will return the same results regardless of which path you choose. Even if you have a trigger or some other complex logic that depends upon inserting in a particular order, I'd have a hard time seeing a case in which A works but B doesn't. I could see this making sense if we were discussing a 1980s RDBMS and you used interleaving parent and child records in a file to improve performance, but that's not really applicable here.

What you might look at instead is getting the order of insertions correct: insert into reference data tables first and go on down the chain. In your example, that's inserting into Person and then into Animal. That way, you won't have any foreign key constraint violations and all records would be up to date. Or, using a tool like Atlantis Interactive's Data Inspector, you can have it disable foreign keys during the load and then re-enable them afterward.