Since you've got many tables with varying numbers of columns, #1 is out: you'd end up with one massive table containing the union of all your columns, and lots of NULLs.
Between #2 and #3 I think you have a decision to make about how much design complexity you want to manage. My view is that it's easier to maintain an exact archive replica of a given table and store the whole row state (with a modified time). Consider a case where you update more than one column of a row: #2 would log a separate entry for each changed column, even though they belong to the same transaction. I'd go with #3 to reduce complexity and to capture point-in-time row state.
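A minimal sketch of #3, with hypothetical table and column names: an archive table mirroring the source table's layout plus a modified timestamp, populated by an update trigger.

```sql
-- Hypothetical source table.
CREATE TABLE widget (
    id    INT PRIMARY KEY,
    name  VARCHAR(100),
    price DECIMAL(10,2)
);

-- Exact replica of the row layout, plus when the state was captured.
CREATE TABLE widget_archive (
    id          INT,
    name        VARCHAR(100),
    price       DECIMAL(10,2),
    modified_at DATETIME NOT NULL
);

-- Capture the full prior row state once per UPDATE, no matter how many
-- columns the statement touched.
CREATE TRIGGER widget_archive_trg
BEFORE UPDATE ON widget
FOR EACH ROW
    INSERT INTO widget_archive (id, name, price, modified_at)
    VALUES (OLD.id, OLD.name, OLD.price, NOW());
```

One trigger per table keeps the archive schema trivially derivable from the source schema, which is the complexity win over per-column logging.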
Have you tried loading in more data and benchmarking it? 100K rows is inconsequential. Try the 250M or 500M you're expecting you'll need to handle and see where the bottlenecks are.
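One low-effort way to get there, as a sketch (the measurements table and its columns are hypothetical): a self-insert doubles the row count on every execution, so about a dozen runs take 100K rows past 400M.

```sql
-- Doubles the table each run: 100K rows reach ~400M after 12 runs.
-- The INTERVAL offset keeps the copies from being exact duplicates.
INSERT INTO measurements (sensor_id, recorded_at, value)
SELECT sensor_id, recorded_at + INTERVAL 1 SECOND, value
FROM measurements;
```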
An RDBMS can do a lot of things if you pay careful attention to its limitations and try to work with the strengths of the system. They're exceptionally good at some things and terrible at others, so you will need to experiment to be sure it's the right fit.
For some batch processing jobs, you really cannot beat flat files, loading the data into RAM, smashing it around using a series of loops and temporary variables, and dumping out the results. MySQL will never, ever be able to match that sort of speed, but if tuned properly and used correctly it can come within an order of magnitude.
What you'll want to do is investigate how your data can be partitioned. Do you have one big set of data with too many cross-links to split it up, or are there natural places to partition it? If you can partition it, you won't have one table with a huge pile of rows, but potentially many significantly smaller ones. Smaller tables, with much smaller indexes, tend to perform better.
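If there are natural split points, MySQL's native partitioning is one way to get many smaller physical pieces behind one logical table. A minimal sketch, assuming a hypothetical time-keyed log table:

```sql
-- Hypothetical log table partitioned by year: each partition carries a
-- much smaller index than one monolithic table would.
CREATE TABLE access_log (
    id        BIGINT NOT NULL,
    logged_at DATETIME NOT NULL,
    payload   VARCHAR(255),
    PRIMARY KEY (id, logged_at)   -- partition column must be in every unique key
)
PARTITION BY RANGE (YEAR(logged_at)) (
    PARTITION p2012 VALUES LESS THAN (2013),
    PARTITION p2013 VALUES LESS THAN (2014),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);
```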
From a hardware perspective, you'll need to test to see how your platform performs. Sometimes memory is essential. Other times it's disk I/O. It really depends on what you're doing with the data. You'll need to pay close attention to your CPU usage and look for high levels of IO wait to know where the problem lies.
Whenever possible, split your data across multiple systems. You can use MySQL Cluster if you're feeling brave, or simply spin up many independent MySQL instances, each storing an arbitrary portion of the complete data set using whatever partitioning scheme makes sense.
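As a sketch only (the events table, user_id key, and four-instance count are all hypothetical), a simple modulo scheme gives each instance a deterministic slice:

```sql
-- Hypothetical modulo sharding across 4 independent MySQL instances.
-- The application routes each query: shard_number = user_id MOD 4.
-- A quick sanity check that shard 2 holds only its own slice:
SELECT COUNT(*) AS misplaced_rows
FROM events
WHERE user_id MOD 4 <> 2;   -- run on shard 2; should return 0
```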
Answering one by one:
Performance: yes, this is the main problem. Not only in file descriptor usage, but also in the MySQL open table cache and data dictionary, InnoDB checks on startup, and the regeneration of table statistics when tables are opened. To be fair, these last ones (except the InnoDB checks) will hit you no matter the value of innodb_file_per_table, simply because of the huge number of tables.
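A quick way to see that churn in practice: if Opened_tables keeps climbing under steady load, the cache is being thrashed by the sheer table count.

```sql
-- Opened_tables counts tables opened since startup; steady growth under
-- normal load means tables are being evicted and reopened constantly.
SHOW GLOBAL STATUS LIKE 'Opened_tables';
SHOW GLOBAL VARIABLES LIKE 'table_open_cache';
```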
The largest problem, however, in my opinion, was that a DROP TABLE with innodb_file_per_table = 1 on older versions of 5.5 required a full buffer pool lock, basically stalling the whole server. Patches applied in the latest versions of MySQL 5.5 and 5.6, plus buffer pool partitioning (available in 5.5 and enabled by default in 5.6), made that a minor issue.

On the bright side, in 5.6 you can set a custom path for every table, which may help in some cases, for example to put the most problematic tables on a separate disk or an SSD, etc.
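A minimal sketch of that 5.6 option (the path and table are hypothetical):

```sql
-- Requires innodb_file_per_table = 1; places this table's .ibd file on
-- a faster device instead of the default datadir.
CREATE TABLE hot_sessions (
    id   BIGINT PRIMARY KEY,
    data BLOB
) ENGINE = InnoDB
  DATA DIRECTORY = '/mnt/ssd/mysql';
```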
Stability: I wouldn't say that it is more stable, but it is true that if you have disk corruption and a single huge file, you can only pray. Having different files makes it easier to "save" all but one or a few. I'm not convinced this is a decisive point, though: InnoDB is well regarded for its on-disk consistency (assuming there is no physical damage).
Concurrency: This is true, but only in specific cases. Ext filesystems had a limitation where, if you were using O_DIRECT mode, you couldn't write concurrently to the same file descriptor. That is completely true, but it is a very specific case: you could use a different filesystem (XFS is highly recommended when handling many files and does not have that problem), you can use the filesystem cache if you have enough RAM, and in some cases high concurrency is not a problem if writes go to different files at the same time (thus innodb_file_per_table = 1 or several ibdata files).
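Whether O_DIRECT is in play at all is controlled by innodb_flush_method, which is set in my.cnf and read-only at runtime; you can inspect it like this:

```sql
-- Shows the configured flush mode (e.g. O_DIRECT, fsync).
SHOW GLOBAL VARIABLES LIKE 'innodb_flush_method';
```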
I cannot see your link; I suppose it is this? Yes, I agree 100% that one thing missing in InnoDB is more control over tablespace management (funnily enough, it is done for the disk tables of the memory-only engine NDB). I have not seen such a thing implemented; you are supposed to do all optimizations at the RAID/LVM/physical level.

I think you are missing the number 1 reason for having innodb_file_per_table = 1: the flexibility it gives for backups, restores, management, and moving tables around individually (something made even easier with the transportable tablespaces in MySQL 5.6). With it enabled, you will be able to recover disk space when tables are dropped, and defragmentation becomes possible. When you have thousands of tables in a single file, per-table physical backup and restore is impossible, and you have to use logical backups (mysqldump/LOAD DATA).
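For example, the 5.6 transportable-tablespace flow lets you physically copy one table between servers (the orders table is hypothetical; the destination must already have an empty table with the same definition):

```sql
-- On the source server: quiesce the table and flush it to disk.
FLUSH TABLES orders FOR EXPORT;
-- (copy orders.ibd and orders.cfg out of the datadir at the OS level)
UNLOCK TABLES;

-- On the destination server: swap the copied tablespace in.
ALTER TABLE orders DISCARD TABLESPACE;
-- (place the copied orders.ibd/orders.cfg into the destination datadir)
ALTER TABLE orders IMPORT TABLESPACE;
```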
So what are the alternatives? I have found this problem before, and I have helped solve it on hosts with literally millions of tables, in several ways:

Mix both modes: keep the static tables in the shared tablespace and give the dynamic ones their own files (innodb_file_per_table is a dynamic variable, although you need to rebuild each table you want to move in or out; see the sketch further down). This solution is OK if your problem is having some tables that are very dynamic, being created and destroyed frequently, while others are static, but you may end up with a very large common tablespace. However, if using innodb_file_per_table = 0 works better for you, then end of discussion.

What is the number 1 way to deal with this problem? What I have been doing on behalf of my clients most of the time is what Wordpress, Facebook and others do: sharding. Do not try to handle a million users on a single powerful machine; use smaller ones and divide the load (and databases and users) between several servers. You do not have to handle the sharding all by yourself; there are tools to help you do that, like the upstream MySQL Fabric (for Java and Python) or the Spider storage engine.
Please note that if the bottleneck is not on CPU/disk/memory, you can even use virtual machines in order to consolidate services.
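Coming back to the mixed-mode alternative sketched above: since innodb_file_per_table only affects tables created or rebuilt after it changes, moving a (hypothetical) hot table into its own file looks like this:

```sql
-- innodb_file_per_table is dynamic, but only tables created or rebuilt
-- after the change are affected.
SET GLOBAL innodb_file_per_table = 1;

-- Rebuild a hot table so it moves into its own .ibd file; untouched
-- tables stay in the shared ibdata tablespace.
ALTER TABLE busy_queue ENGINE = InnoDB;
```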
In general, innodb_file_per_table is something that you want ON, for the flexibility reasons I gave above. If you cannot have it on (test it first; maybe you will be surprised, and avoid premature optimization), you can disable it for all or for some of the tables (though the change itself will be more painful than anything else). But be prepared for the consequences, and try to mitigate them with some of the recommendations I gave above.

Update: the 5.7 labs releases have announced that this dilemma may finally become obsolete: we will at last have better tablespace management thanks to a CREATE TABLESPACE syntax (search for "InnoDB: General Tablespaces"). With this, you will be free to create your own "innodb_file_per_database" or any other scheme you want.
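Going by that announcement, a per-database grouping might look something like this (names are hypothetical, and the syntax comes from a labs preview, so it may change):

```sql
-- Sketch of the announced general-tablespace syntax: one shared
-- tablespace holding all the tables of one database.
CREATE TABLESPACE ts_reporting
    ADD DATAFILE 'ts_reporting.ibd'
    ENGINE = InnoDB;

CREATE TABLE reporting.daily_rollup (
    day   DATE PRIMARY KEY,
    total BIGINT
) TABLESPACE = ts_reporting;
```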