Sql-server – What process would you use to determine references on a database that has no integrity

database-designdocumentationrelational-theorysql server

Imagine you inherit a database which has a bunch of tables, no documentation and zero referential integrity. Now you're trying to find out what are the various relationships between different tables.

What process would you go through to do this e.g. finding common data patterns, matching these across tables?

Best Answer

You could programmatically build queries to test which combinations of columns look like they might be keys. In order to test a given relationship you'd run a query of the form:

SELECT 1 AS [IsPossibleKey: ChildTable.KeyColumn => ParentTable.PrimaryKeyColumn]
WHERE NOT EXISTS (
 SELECT *
 FROM ChildTable c
 WHERE NOT EXISTS (
  SELECT *
  FROM ParentTable p
  WHERE c.KeyColumn = p.PrimaryKeyColumn
 )
)

This can give you a fairly raw list of key relationship candidates. I would write a SQL query to programmatically generate all possible queries of the above form, paste all of them into a new SSMS window and run time.

You probably should exclude candidates with zero non-NULL values in the child table column.

Another problem is that bad data might have slipped in. Heck, since there are no FKs it almost certainly has slipped in somewhere.

Also, this only finds single-column combinations.