MySQL Database Design – Storing Files with Extension as BLOB

blobdatabase-designfilesMySQLoptimization

Recently during development process I've faced file storage issue. I'm not sure which solution is the best one that's why I decided to write this question. I don't have a lot experience of database administration that's why any tip can be really useful for me.

In my application project I have several types of files that have relationship with specific tables, for example:

  • Invoice has one scan
  • User has one document
  • Transaction has one receipt
  • Building has many photos

… etc.

All those files: scan | document | receipt | photo have different extensions(pdf/jpg/png etc.).

I decided that files corresponding for each table will be stored as BLOB datatype. Because of BLOB usage there is need to store file extension somewhere in database.

I was considering few possible solutions:
(For demonstration purposes I will attach simple tables visualization)

enter image description here

user | invoice | building_photo | transaction have require file stored as BLOB or MEDIUMBLOB, but extension of files is missing. What would be the best way of storing that? as VARCHAR column in each table or as separate table?

Example of extension storage in each table:

enter image description here

This solution is the easiest one to implement, extension as additional VARCHAR is handy when I'm inserting new data, but I'm repeating this information since I will have two files with the same extension.

Example of extension storage in separate table:

enter image description here

This solution seems to be more optimized but little bit harder to manage while inserting new data. I would be grateful for any suggestion about problem described above.

Another implementation that I've considered is to not store all files in separate tables but instead create one table called file that will contains file BLOB and then invoice | user | transaction | building etc.
will have relationship to that table. Visualization below:

enter image description here

In this case again I'm keeping extension in separate table, but file data as well. All files are in the same place but I'll end up with large table and a lot one to one relationships. If such solution makes sense also I would be grateful for option, or maybe there is another way of doing such things?

Best Answer

I think "user has one document" (and the others) is really a business rule - something you might need to change in the future, not a data rule. Therefore it should be enforced by triggers / constraints, not your schema

I'd go with a schema like this:

USERS -< USER_DOCUMENTS >- FILES

INVOICES -< INVOICE_SCANS >- FILES

BUILDINGS -< BUILDING_PHOTOS >- FILES

TRANSACTIONS -< TRANSACTION_RECEIPTS >- FILES

Where FILES looks like this:

FILES
file_id
content_type
name
data (BLOB)

While we're at it "transactions" is a very overused term in databases.