Mysql – Minimizing database size (Many small tables)

database-sizeinnodbmariadbMySQL

Our database is interesting in that we have a large amount of tables that take up a correspondingly large amount of space, but when we do a MySQL dump of said database, it is surprisingly small.

Specifically, the database is about 50GB, and contains about 50k tables. When dumped, it takes up about 5GB. (These numbers aren't spot on, but they're close enough for our discussion here). It's particularly odd to me that the "binary" DB takes up so much more space than the SQL-dumped DB.

The vast majority of the tables (like 49990) are relatively small WordPress Multisite tables. These are seldom-used sites with very little content on them.

If it's relevant, we're using innodb_file_per_table.

What is the best way to minimize the size of our database? Does minimizing the size of the database come at a cost of lower performance? Ultimately, I'd like to reduce the size of the DB in order to increase performance (e.g. with regard to doing backup/restore operations).


Update: The structure of the tables is essentially just the default WordPress Multisite layout: http://pastie.org/private/iufzw8z9zlyidqw8b7wggw Note that I looked into some more accurate numbers, and it appears we have about 9k Multisite instances for a total of nearly 80k tables. The larger numbers are partially due to our service continuing to grow and add new customers.

Best Answer

"Does minimizing the size of the database come at a cost of lower performance?"

Typically databases are IO limited, unless they are regularly recalculating reporting style queries. (in which case pre-calculated views can be added to again shift cpu to IO.)

Minimizing the size of the database typically minimizes the IO needed against the disk, as a higher % of all data can be in cache.

Compressed storage can also help here; if 1 disk iop can retrieve more rows, then that too can increase performance, without needing to logically re-arrange the data. (many compression schemes are cpu-efficient enough that it is possible to actually see a decrease in cpu usage, due to having to handle less data pages.)

So, in general, minimizing the size of the database does indeed increase performance, but always benchmark as there are many counter-examples to any generic statements like this one (note, the answer by RolandoMySQLDBA lists some downsides of compression concerning memory pressure).

some of the tradeoffs due to data compression are listed at http://dev.mysql.com/doc/refman/5.5/en/innodb-compression-internals.html#innodb-compression-internals-storage