Mysql – Running two MySQL servers that use the same data directory

dbmsMySQL

The situation

I have two MySQL instances, deployed on two different remote machines.
Both MySQL instances access the same data directory, which is stored on a persistent volume (this volume is provided by Gluster, but that's not really relevant, I'm only saying this for the sake of completeness).

I'm aware that it is strongly discouraged to run two MySQL servers using the same data directory, but it's not an impossible thing to do, as long as one takes the right precautions. In order to do this, I followed the steps suggested by the official documentation ( https://dev.mysql.com/doc/refman/5.6/en/multiple-data-directories.html ), more specifically in the paragraph named Warning. To sum it up:

  • I'm using MySQL 5.6.
  • I'm using MyISAM as default engine, as required.
  • I specified log file names that are unique to each server.
  • One server is going to be read-write, while the other server is going to be read-only.

So, I set everything up, I imported a test database and everything worked flawlessly. Both servers could read data from the same data directory.

The problem

Now comes the problem. As soon as one server performs a write (by inserting a row into a table), the other server marks the table as corrupted for itself.
After looking into the problem for a while, I found out why this happened. Basically each server holds its own metadata describing each table. So, as soon as the first server updated a table, it also updated its own metadata, but it didn't update the metadata of the second server. As the second server checked the table, it noticed a mismatch between the current number of rows in the table and the number of rows it had previously saved in its metadata, therefore it marked the table as corrupted.
After some research, it turns out that the metadata is saved inside a default database named information_schema. At first, I thought of making the two server share the same information_schema, but this was nowhere to be found. As I later discovered, it's saved inside the memory of the program and there's no way you can manually update or access it.

The questions

  1. Since the documentation tells you exactly what precautions to take, I was expecting this to actually work…or else, what's the point of that guide in the official documentation? If this couldn't be done, why would it tell you how to do it in the first place?
  2. Most importantly, do you know how to make this work, using MySQL?
  3. If not, do you know of any other DBMSs that allow you to obtain this result (that is, running two servers that use the same data directory)?

Just keep in mind that I don't want to set up master-slave replication.
All other suggestions are welcome, thanks everyone for the help.

EDIT: I ended up not doing it, as everyone discouraged me from doing it. Thanks everyone for your input!

Best Answer

  1. Since the documentation tells you exactly what precautions to take, I was expecting this to actually work...or else, what's the point of that guide in the official documentation? If this couldn't be done, why would it tell you how to do it in the first place?

  2. Most importantly, do you know how to make this work, using MySQL?

You said something that might have a bearing on the first two question

Both MySQL instances access the same data directory, which is stored on a persistent volume (this volume is provided by Gluster, but that's not really relevant, I'm only saying this for the sake of completeness).

Interestingly, someone has claimed to have pulled this off.

On Sep 24, 2010 (nearly either years ago), Richard Holloway answered a question he himself posted in Server Fault : Can I run mysqld on top of glusterfs?

Quoting Richard Holloway's answer

This is a very daring answer ( Godspeed, Spiderman !!! )

Just to throw in my 2 cents

  • Do not use GlusterFS in async mode !!!
  • Configure skip-innodb on both mysqld instances
  1. If not, do you know of any other DBMSs that allow you to obtain this result (that is, running two servers that use the same data directory)?

The answer is Oracle RAC. It is designed to operate multiple Oracle instances sharing only one set of datafiles. Each instance has its own log buffer. Oh yeah, you will need boat loads of money (Oracle RAC is not open source).