Mysql – Specific disadvantages to storing infrequently accessed large files in a large database

innodbMySQL

I have heard many times across the web that it is, for various reasons, a bad idea to store files directly in a database, the preferred method being to store filenames in the database, and files directly on the filesystem. Most explanations, however, seem to assume that the stored files (in my case 2MB longtext object containing XML) are important to retrieve along with the query. I've seen a few comments in passing, however, that suggest that by storing these files directly in your database you are bloating it and thereby "interfere with IO or query performance of the rest of your database."

Under what circumstances does this happen, and why?

It is my understanding that blob and longtext over a certain size were stored externally anyways. Does the database still attempt to load up the entire longtext of an entry when it's respective row is used in operations? My main concern here comes from the fact that most of the queries of the table containing these rows of longtext do not actually need the longtext xml for anything. Does simply having these cumulatively massive chunks of data sitting in the same table really affect the performance of other queries against that table?

Best Answer

InnoDB Architecture

InnoDB Architecture

This first thing that comes to mind is the InnoDB Buffer Pool. It holds three things

  • Data Pages from Accessed .ibd files
  • Index Pages from Accessed .ibd files
  • Insert Buffer for Changes to Secondary Indexes

Any query accessing infrequent data will basically remove the working dataset from the InnoDB Buffer Pool just to load the adhoc data. To get those data back, you have two options:

EPILOGUE

  • OPTION 1 is something you reluctantly live with until all subsequent queries bring back your most recent working dataset
  • OPTION 2 requires scripting adhoc reports to do the following
    • Backing up the InnoDB Buffer Pool
    • Run the adhoc queries on the infrequent data
    • Restoring the InnoDB Buffer Pool
  • OPTION 2 requires upgrading to MySQL 5.6
  • There are other ways MySQL and InnoDB are impacted. I discussed these in my recent post How to optimize MySQL for large BLOB updates