Sql-server – How to model very flexible referencing relationships between many tables

database-designreferential-integrityschemasql server

I have various different entity types in my database. For purposes of example, consider an IMDB database:

  • Movie
  • Person
  • Actor
  • Director
  • Translation
  • and 20 more tables

My application enables free-form many-many relationships between any of these entities.

What is a sane way to model this in a maintainable way?

Currently, my app uses a table with "a_id", "a_type", "b_id", "b_type". With the type identifying the table (e.g. "movie"). This means there is no referential integrity checking, which makes me uneasy.

What are the alternatives? The only one that comes to mind is to expand the linking table with a column for every other table. So I have a_movie_id, a_actor_id, a_director_id and b_movie_id, b_actor_id, b_director_id. However, while this preserves referential integrity, it feels strange (I have not seen such tables before) and I suspect it would be difficult for application code to process it (it would have to have a big if-statement to check what is actually referenced).

Thus I ask for your advice – is there a convenient industry standard way to model such flexible relationships?

My database engine is is SQL Server.

Best Answer

This is opinion-based. I wouldn’t try to suggest there’s a best-practice here.

If you have separate tables for actors, directors, movies, and have a generic EntityID column with separate EntityType column, you’ll always need to use dynamic SQL to query it, because until you know the type, you won’t know which table to join to.

Another option is to have a generic table of entities which contains the type. Your relationships table is then just two columns.

Then you have sparse columns in your entities table for all your properties, as directors have different attributes from movies.

And you index it well, so that you’re not trawling through all the actors when you want a list of movies. Filtered indexes can be particularly handy here, so that you have a dedicated index for movies which includes the movie-related columns.

Views can make querying easier too, but for traversing relationships you can use the generic table.