Mysql – Automatic MySQL Sync from master, where Slave was offline for long period of time

MySQLreplicationvirtualisation

Does anyone here know if there are any tools, not using mysql's built in replication that will sync up two databases when the slave has been inaccessible from the internet for extended periods of time?

The idea is to set up a virtual machine basis for our web developers, who can power on the VM and have it automatically clone from a slave of the production DB for the purposes of testing/development. they all require their own DB's but they should all replicate the master DB on startup.

Diagram of operation:

Production DB -> Production Read Only Slave -> Dev Virtual Machine

Best Answer

The only DB I could think of would be Percona XtraDB Cluster (PXC)

It has two methods for copying an entire Cluster Node

  • SST (State Snapshot Transfer) is the full copy of data from one node to another. It’s used when a new node joins the cluster, it has to transfer data from existing node. There are three methods of SST available in Percona XtraDB Cluster: mysqldump, rsync and xtrabackup (Percona XtraBackup with support of XtraDB Cluster will be released soon, currently you need to use our source code repository). The downside of mysqldump and rsync is that your cluster becomes READ-ONLY while data is being copied from one node to another (SST applies FLUSH TABLES WITH READ LOCK command). Xtrabackup SST does not require READ LOCK for the entire syncing process, only for syncing .frm files (the same as with regular backup).
  • IST (Incremental State Transfer) : Functionality which instead of whole state snapshot can catch up with te group by receiving the missing writesets, but only if the writeset is still in the donor’s writeset cache.

I addressed this before

CAVEAT

This would be the ideal solution if the entire database is InnoDB. If you have read-only MyISAM tables, that would be OK provided you copy the MyISAM to every node manually. SST will do a full copy of MyISAM tables, but the MultiMaster Replication feature of PXC will not replication MyISAM data.

As for PXC, I rigorously tested it in Amazon EC2 and it works like a dream. PXC works as advertised.

UPDATE 2013-02-06 16:40 EDT

drogart made the following comment:

I don't think having one node of your production cluster be a dev/testing environment is a good idea. That would mix dev write traffic back into the prod db. It would also create a situation where devs could impact the prod db cluster by spinning up a new node.

Here has a very valid point. In light of this, be very carful and sparing as to what developers can do this.

Nevertheless, should you ever need a full copy of production in a dev server, offpeak:

  • STEP01) Setup new PXC server
  • STEP02) Configure wsrep options
  • STEP03) Start MySQL (SST should kick off)
  • STEP04) Shutdown MySQL
  • STEP05) Comment out wsrep options
  • STEP06) Restart MySQL (Full of Prod Data with no connectivity to the Real Cluster)

Once this dev server is set up, do the following to update it:

  • STEP01) Shutdown MySQL
  • STEP02) Uncomment wsrep options
  • STEP03) rm -f /var/lib/mysql/galera.cache
  • STEP04) rm -f /var/lib/mysql/grastate.dat
  • STEP05) Start MySQL (SST should kick off)
  • STEP06) Comment out wsrep options
  • STEP07) Restart MySQL (Full of Prod Data with no connectivity to the Real Cluster)

CAVEAT : Do this copy of Prod to Dev during offpeak hours !!!