There isn't currently anything in the PostgreSQL core which would help with that, but there is work in progress to do what you want that is likely to make it into version 9.3, which will probably be released in the summer of 2013. If you can't wait for that, there are a number of solutions available from third parties. From what you describe, Bucardo sounds like it might be the best fit.
A new option, which recently hit production release 1.0, is Postgres-XC. It is exciting technology, but if you have a stable production environment you might want to just keep an eye on it for now; it's pretty "bleeding edge" at the moment, and they haven't yet figured out how to incorporate all of the features of normal PostgreSQL in that environment.
The pg_xlog_location_diff within your query may return more than 0.
There are three writing timing of WAL data to WAL segment file: 1) When a transaction is committed, 2) When WAL buffer is filled up, 3) When WAL writer process writes periodically (default is 200 milliseconds).
And basic sequence of streaming replication is shown below:
- (master) Write WAL data.
- (master) Send WAL data to a slave.
- (slave) Receive WAL data and replay it.
- (slave) Return some information (e.g. replay location, receive location, flushed location.)
I show an extreme example to clarify the problem. If you run a long transaction to insert huge data, master continues to send WAL data till the transaction is committed. When you issue your query during the period from 2 to 3, pg_current_xlog_location()
returns the current location whereas replay_location
returns the previous replay location, pg_xlog_location_diff(pg_current_xlog_location(), replay_location)
therefore returns more than 0.
Other case is the following: The configuration parameter max_standby_*_delay
is enable and a conflict occurs, your standby suspends replaying WAL data, therefore pg_xlog_location_diff()
returns more than 0.
(Both of examples are logical extreme cases.)
In any case, I think your problem does not depend on whether replication mode is sync or async. The difference between sync and async is that the master confirms a synchronous slave commits transactions or not.
Your problem is related to general sequence of WAL data sending and replaying, but is not depend on the synchronous mode.
(Consider how a long transaction is committed in synchronous mode. I discussed what happens when your query is issued during the period from 2 to 3 (it's very short period) in the streaming replication sequence I showed. I did not discussed when transaction is committed because it is not essential of the answer, i think.)
Best Answer
Yes,
There is a setting for this which is called synchronous_standby_names.
You can give expression like
ANY 3 (r1, r2, r3, r4)
which says "proceed commit as soon as at least any three standbys reply"If you want to specify exact replication name just name them like
r1,r4
.Here is sample usage;
Note: The standby names (which are r1,r2..) are defined by application_name option of standby's primary_conninfo setting value.