SQL Server – Category entity relationship / subtype relationship design

database-design, sql-server, subtypes

I am looking to implement a generic comment and file attachment facility within a web application which would be used across various parts and allow users to provide comments and add attachments to different entities.

Whilst looking for a suitable solution I came across this answer, and I want to check that I fully understand how the approach is implemented and what its potential pitfalls are.

Using a schema similar to the one below, the ContractId, ContractLineId and VariationId values, along with those of any future entities, would be generated from a SEQUENCE (available from SQL Server 2012), and each generated value would be stored in the Entity table along with its EntityType.

Sharing the SEQUENCE across multiple entities means a single Comment table and a single Attachment table are enough, so a Comment or Attachment can be created against a Contract, a ContractLine, a Variation or any future entity.
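A minimal T-SQL sketch of the arrangement described above. The original diagram is not reproduced here, so every name and type that does not appear in the text (dbo.EntityIdSequence, the constraint names, UserId, CommentText, the column types) is an assumption made for illustration. Only Contract is shown; ContractLine, Variation and an Attachment table would follow the same pattern.

```sql
-- One sequence supplies keys for every entity type (hypothetical name).
CREATE SEQUENCE dbo.EntityIdSequence AS INT START WITH 1 INCREMENT BY 1;

-- Supertype table: one row per Contract, ContractLine or Variation.
CREATE TABLE dbo.Entity (
    EntityId   INT         NOT NULL,
    EntityType VARCHAR(20) NOT NULL,
    CONSTRAINT PK_Entity PRIMARY KEY (EntityId),
    CONSTRAINT CK_Entity_EntityType
        CHECK (EntityType IN ('Contract', 'ContractLine', 'Variation'))
);

-- Subtype table: its key is also a foreign key to Entity.
CREATE TABLE dbo.Contract (
    ContractId   INT           NOT NULL,
    ContractName NVARCHAR(100) NOT NULL,
    CONSTRAINT PK_Contract PRIMARY KEY (ContractId),
    CONSTRAINT FK_Contract_Entity
        FOREIGN KEY (ContractId) REFERENCES dbo.Entity (EntityId)
);

-- A single Comment table that can point at any entity.
CREATE TABLE dbo.Comment (
    CommentId   INT IDENTITY(1, 1) NOT NULL,
    EntityId    INT                NOT NULL,
    UserId      INT                NOT NULL,  -- assumed column
    CommentText NVARCHAR(MAX)      NOT NULL,  -- assumed column
    CONSTRAINT PK_Comment PRIMARY KEY (CommentId),
    CONSTRAINT FK_Comment_Entity
        FOREIGN KEY (EntityId) REFERENCES dbo.Entity (EntityId)
);
GO  -- batch separator understood by SSMS and sqlcmd

-- Creating a contract: draw a value from the sequence, then use it
-- for both the Entity row and the Contract row.
DECLARE @EntityId INT = NEXT VALUE FOR dbo.EntityIdSequence;

BEGIN TRANSACTION;
    INSERT INTO dbo.Entity (EntityId, EntityType)
    VALUES (@EntityId, 'Contract');

    INSERT INTO dbo.Contract (ContractId, ContractName)
    VALUES (@EntityId, N'Example contract');

    INSERT INTO dbo.Comment (EntityId, UserId, CommentText)
    VALUES (@EntityId, 42, N'First comment on this contract');
COMMIT TRANSACTION;
```

In this sketch nothing stops a Contract row from pointing at an Entity row whose EntityType is 'ContractLine'; if that matters, it would need an extra constraint.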

The Entity table is used as a sort of junction table (or some other term) and would allow queries to return all Attachments against particular entity type(s).

Things get a little messy, though, if comments or attachments against multiple entity types need to be returned along with some information about each entity. For example, if all comments made by a user were to be queried along with the name of the entity each comment was made against.

Am I going down the right path, or should this be avoided?
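To make the "messy" case concrete, here is a rough sketch of that query. Table variables stand in for the real tables so the batch runs on its own; UserId, CommentText, the sample rows and the column types are assumptions, while the per-type name columns (ContractName, ContractLineName, VariationName) are the ones mentioned in the answer below.

```sql
-- Stand-ins for the real tables (illustrative columns only).
DECLARE @Entity TABLE (
    EntityId   INT         NOT NULL PRIMARY KEY,
    EntityType VARCHAR(20) NOT NULL
);
DECLARE @Contract TABLE (
    ContractId   INT           NOT NULL PRIMARY KEY,
    ContractName NVARCHAR(100) NOT NULL
);
DECLARE @ContractLine TABLE (
    ContractLineId   INT           NOT NULL PRIMARY KEY,
    ContractLineName NVARCHAR(100) NOT NULL
);
DECLARE @Variation TABLE (
    VariationId   INT           NOT NULL PRIMARY KEY,
    VariationName NVARCHAR(100) NOT NULL
);
DECLARE @Comment TABLE (
    CommentId   INT           NOT NULL PRIMARY KEY,
    EntityId    INT           NOT NULL,
    UserId      INT           NOT NULL,
    CommentText NVARCHAR(400) NOT NULL
);

INSERT INTO @Entity (EntityId, EntityType) VALUES
    (1, 'Contract'), (2, 'ContractLine'), (3, 'Variation');
INSERT INTO @Contract (ContractId, ContractName) VALUES (1, N'Contract A');
INSERT INTO @ContractLine (ContractLineId, ContractLineName) VALUES (2, N'Line 1');
INSERT INTO @Variation (VariationId, VariationName) VALUES (3, N'Variation X');
INSERT INTO @Comment (CommentId, EntityId, UserId, CommentText) VALUES
    (10, 1, 42, N'Comment on the contract'),
    (11, 2,  7, N'Comment on the line'),
    (12, 3, 42, N'Comment on the variation');

DECLARE @UserId INT = 42;  -- hypothetical user

-- One LEFT JOIN per entity type; at most one of them matches each comment,
-- and COALESCE picks the name from whichever did.
SELECT c.CommentId,
       e.EntityType,
       COALESCE(ct.ContractName, cl.ContractLineName, v.VariationName) AS EntityName,
       c.CommentText
FROM @Comment AS c
JOIN @Entity AS e
    ON e.EntityId = c.EntityId
LEFT JOIN @Contract AS ct
    ON ct.ContractId = c.EntityId
LEFT JOIN @ContractLine AS cl
    ON cl.ContractLineId = c.EntityId
LEFT JOIN @Variation AS v
    ON v.VariationId = c.EntityId
WHERE c.UserId = @UserId
ORDER BY c.CommentId;
```

With this sample data the query returns the two comments made by user 42. Every new entity type adds another LEFT JOIN and another COALESCE argument, which is where the messiness comes from.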

[Schema diagram: image not available]

Best Answer

I think this looks pretty reasonable for a system in which a user can add comments or attachments associated with a contract, a specific line of the contract, or a proposed modification to a line of a contract.

Since you asked for potential shortcomings or pitfalls, here are a few you might consider:

  • Without knowing much about your system, it seems strange that a variation must apply to only one contract line. Is there ever a case where a variation might apply to multiple lines? And then a user might want to comment on the multi-line variation in its entirety? If so, this case doesn't appear to be handled right now.

  • It's surprising that there aren't any columns other than EntityType that are common across all entities. Perhaps there should be an EntityName column rather than separate columns ContractName, ContractLineName, and VariationName? This would make the example query you describe in your question far simpler.

  • The data model does not strictly enforce that a Variation row contains logically consistent data. It stores both a ContractId and a ContractLineId. The foreign keys on these columns enforce that each value is valid on its own, but they still allow a ContractLineId that belongs to a different contract (that is, one whose ContractId in the ContractLine table differs from the ContractId stored on the Variation). One option to address this is to remove the ContractId column from the Variation table.
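The last two points can be sketched together as follows. This is only an illustration under assumptions: table variables stand in for the real tables so the batch runs on its own, and UserId, CommentText, the sample rows and the column types are invented for the example. EntityName is the shared column suggested above, and Variation carries no ContractId.

```sql
-- Entity now carries the shared name, so no per-type name columns are needed.
DECLARE @Entity TABLE (
    EntityId   INT           NOT NULL PRIMARY KEY,
    EntityType VARCHAR(20)   NOT NULL,
    EntityName NVARCHAR(100) NOT NULL
);
DECLARE @Comment TABLE (
    CommentId   INT           NOT NULL PRIMARY KEY,
    EntityId    INT           NOT NULL,
    UserId      INT           NOT NULL,
    CommentText NVARCHAR(400) NOT NULL
);
-- ContractLine is the only place that records which contract a line belongs to.
DECLARE @ContractLine TABLE (
    ContractLineId INT NOT NULL PRIMARY KEY,
    ContractId     INT NOT NULL
);
-- Variation keeps ContractLineId only; there is no ContractId to contradict it.
DECLARE @Variation TABLE (
    VariationId    INT NOT NULL PRIMARY KEY,
    ContractLineId INT NOT NULL
);

INSERT INTO @Entity (EntityId, EntityType, EntityName) VALUES
    (1, 'Contract',     N'Contract A'),
    (2, 'ContractLine', N'Line 1'),
    (3, 'Variation',    N'Variation X');
INSERT INTO @ContractLine (ContractLineId, ContractId) VALUES (2, 1);
INSERT INTO @Variation (VariationId, ContractLineId) VALUES (3, 2);
INSERT INTO @Comment (CommentId, EntityId, UserId, CommentText) VALUES
    (10, 1, 42, N'Comment on the contract'),
    (12, 3, 42, N'Comment on the variation');

DECLARE @UserId INT = 42;  -- hypothetical user

-- All comments by a user with the entity name: a single join to Entity.
SELECT c.CommentId, e.EntityType, e.EntityName, c.CommentText
FROM @Comment AS c
JOIN @Entity AS e
    ON e.EntityId = c.EntityId
WHERE c.UserId = @UserId
ORDER BY c.CommentId;

-- The contract for a variation is derived through its line, not stored twice.
SELECT v.VariationId, v.ContractLineId, cl.ContractId
FROM @Variation AS v
JOIN @ContractLine AS cl
    ON cl.ContractLineId = v.ContractLineId;
```

The trade-off is one extra join whenever a variation's contract is needed, in exchange for there being no pair of columns that can disagree.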