I wrote up an interesting layout last year which features DRBD pairs in two data centers (DC1,DC2) with as follows
- DRBD Pair in DC1 (db1 and db2)
- DBVIP for Primary of DRBD Pair1 is 10.1.2.30
- DRBD Pair in DC2 (db3 and db4)
- DBVIP for Primary of DRBD Pair2 is 10.1.2.40
- Have MySQL Circular Replication Between DRBD Primaries
- Have the 10.1.2.40 as Master_Host for DBRD Pair 1
- Have the 10.1.2.30 as Master_Host for DBRD Pair 2
MySQL high availability, failover and replication with Latency
MySQL Replication : 1 Slave / Multiple Masters
Here is why I suggested this: Using two data centers, you setup automatic failover for DRBD Pair in one data center. Let the other DRBD Pair in the second data center be for DB disaser site with it own local redundancy and failover. Should you ever loses one data center, the other data center is fully read with it own local failover setup. Your app would just have to use the DBVIP of the other database center in such a catastrophic case.
Please keep in mind that using DRBD in conjunction with MySQL is only beneficial if all of your data uses the InnoDB Storage Engine. Hard failovers in DRBD could easily result in crashed MyISAM tables.
Here is another setup to consider:
As with DRBD setups, any DRBD Secondary would provide just a Disk-Level copy of your MySQL Folder. It is available as a warm standby. MySQL is not being run on the DRBD Secondary. If you want the third DB server to become hot standby, ditch DRBD altogether and use pure MySQL Replication. With three DB servers, using db3 at a remote site, simply setup the following:
+--> db1 --> db2 --> db3 -->+
^ |
| V
+<--------------------------+
Using your rudimentary failover, now you have two hot standby servers. You just have to make sure each DB server has a unique server_id value. I also recommend using MySQL 5.5 because it uses SemiSync Replication which is more sensitive to communication failures and stop Replication better. You will have to setup the appropriate heartbeats and timeouts.
The tutorial you linked to is, I'm afraid, very optimistic. It forgets to note that it only covers the case of an auto generated AUTO_INCREMENT PRIMARY KEY
.
The methods it suggests do not cover the general case for a UNIQUE KEY
, including, of course, a general PRIMARY KEY
.
The article does mention that "The most common problem with replication is primary key collision". It then completely fails to mention all the other problems...
So, anyway, any UNIQUE KEY
collision will render your replication broken, possibly in both directions.
Which table was it that failed your replication? Was that a table with an AUTO_INCREMENT PRIMARY KEY
or a "general" PRIMARY KEY
(ie on non-AUTO_ICNREMENT
columns)?
To solve the general case of unique keys, your application must be very aware of its location (Netherlands or Washington DC), and must refrain from writing data that may cause collision. Alternatively it might want to use some synchronizing/locking mechanism; that would of course induce latency as data must cross the Atlantic.
Might I also suggest that the use of binlog_do_db
is also naive and dangerous. Are you sure you don't want to replicate the mysql
schema? What happens when you add new schemas? The general correct solution is to have full replication without filtering, or otherwise have a very good explanation why not to do so.
Finally, I highly discourage the use of active-active master-master replication. I had customers using it in the past. The amount of trouble they went into because of that setup, and in spite of my counter-recommendations, led to many ruined weekends and holidays.
You might want to look at Galera replication. It's a synchronized master replication solution, and I hear it works over WAN. This will allow your application to remain ignorant. Watch out for network failures, as one part of your cluster will turn immutable.
Best Answer
If you are considering PostgreSQL 9.0 with Hot Standby, set up the archive so that the archive files only become available to the slave after a certain time. For example, you could have a little shell script that hardlinks the files from an incoming archive directory to an outgoing one based on file timestamps. Make sure you preserve the order. (Naturally, don't activate streaming replication if you do this.)