Sql-server – One generic mapping table VS many specific mapping tables

database-designsql servertable

I'm trying to design database tables to track files uploaded by end users.
Files can be uploaded under different contexts.
Each context is a different table.

As an example:

  • An employee can upload multiple "receipt" files against a "weekly expense" entry.
  • An employee can upload multiple "photo" files against a "pets" entry.

The entities / tables in this (contrived) scenario are:

  • Employees (Id, Name)
  • Expenses (Id, Date, RequestedRefund, RequestedByEmployeeId)
  • Pets (Id, Name, Type, BelongsToEmployeeId)

I already have a table called Files which tracks information about the file itself:
Files (Id, Name, Size, Extension, Folder)

My question is around how to map a expense receipt upload and pet photo upload against the file record. I know I can do this in 2 ways:

1st Option:

Have one generic mapping table: GenericFileMap (FileId, ContextId, Type)
where

  • FileId is the id of the file record
  • ContextId is the id of the context record that i'm trying to retrieve
  • Type is the field to describe the context itself

In this scenario, I'd fetch all expense receipts for an employee like so:

SELECT *
FROM Employee e
INNER JOIN Expenses ex ON e.Id = ex.RequestedByEmployeeId
INNER JOIN GenericFileMap g ON g.ContextId = ex.Id AND g.Type = 'expense'
INNER JOIN Files f ON g.FileId = f.Id

2nd Option:

Have separate mapping table for each entity / context table:

  • ExpenseFiles (ExpenseId, FileId)
  • PetFiles (PetId, FileId)

The query for receipts would be:

SELECT *
FROM Employee e
INNER JOIN Expenses ex ON e.Id = ex.RequestedByEmployeeId
INNER JOIN ExpenseFiles ef ON ef.ExpenseId = ex.Id
INNER JOIN Files f ON ef.FileId = f.Id

I believe the 1st option is not the "normalized" way to do things in a relational database. But the only reason i'm considering this is, in my particular situation, there are at least 15 different contexts from which a user can upload / attach files.

The only benefit i see from the 1st option is the time I'd save creating and duplicating tables and stored procedures (one each for CRUD) for each and every context.

However, it does seem that the 2nd option is the "right" way to do things.

My questions:

  1. Which of the two options is the smarter approach?
  2. Which of the two is better for performance in the long run (I'll definitely put in the extra effort needed if the second option will be more performant) ?

Best Answer

You don't want option # 1 because:

  1. You might as well just place ContextId and Type into the Files table.
  2. You would be better off with a TypeID TINYINT field (and the associated Type lookup table) so that you didn't need to do an inefficient string comparison each time.

What is probably best is a modified option # 2. Yes, start with the context-specific file map tables (e.g. ExpenseFiles, PetFiles, etc) but also add a ContextTypeID TINYINT column in the existing Files table. There should also be a new ContextType lookup table with ContextTypeID TINYINT NOT NULL as the Primary Key (but do not make it an IDENTITY column). You can add a Name column to the lookup table, and a Foreign Key from Files to ContextType on ContextTypeID. Having this new column in the Files table will make it much easier to determine which context-specific file map / property table the associated record is in.

The reason for going with option # 2, even if there are 15 different contexts, is that, inevitably, at least some of them will evolve and diverge over time. So you might have:

  • ExpenseFiles (ExpenseId, FileId, Date, Total)
  • PetFiles (PetId, FileId, Name, Age, AnimalType)