For audit purposes, I'm trying to obtain a copy/export of a DB but including only the rows from each table that have relevance to a small collection of rows in some table. So every row of every table that these rows point to; every row of every table that points to these rows; and so forth. (In algebra terms this is the transitive-symmetric-reflexive closure of those rows with respect to the foreign key relation.)
I'm wondering if there exists some kind of tool to do this, or if this is scriptable somehow. I have ~400 tables with a tight network of foreign keys so I'd rather not do this by hand.
Best Answer
You could certainly script it with a nested stored procedure:
EXEC recursive_shaped_query @StartTable='SomeTable', @KeyColumn='Id', @ValuesSQL='SELECT Id FROM (VALUES (1),(6),(4576),(42)) AS Ids(Id)'
SELECT * FROM SomeTable WHERE Id IN (SELECT Id FROM (VALUES (1),(6),(4576),(42)) AS Ids(Id))
sys.foreign_keys
to find any tables that reference this table and…EXEC recursive_shaped_query @StartTable='ReferencingTable', @KeyColumn='Id', @ValuesSQL='SELECT Id FROM SomeTable WHERE Id IN (SELECT Id FROM (VALUES (1),(6),(4576),(42)) AS Ids(Id))'
Of course there would be a number of complications to consider:
SELECT * FROM t1 WHERE Id IN (SELECT Id FROM t2 WHERE Id IN (SELECT Id FROM Tt3 WHERE ... )))
nesting may get silly long and be queries that take an age to run due to table scans if useful indexes are not in place.Other things to note:
For our systems in [Day Job] I have a set of such queries manually created to pull out nested data for given entities as these procedures are useful for diagnostics, but I've never needed nearly enough of them to warrant trying to create a generator or other automated solution. I've thought about it as a play project, but not actually done it.