Postgresql – When using a master plus multiple slaves, how to change over from one server to a brand new one with minimal downtime

change-managementpostgresqlreplication

In my particular case, I'm using Heroku Postgres (repackaged Amazon EC2 instances running PostgreSQL), but platform-agnostic answers are also good!

Heroku's instructions
describe the process for transfering over a single database (i.e. without slaves):

  • You have an existing server, A.
  • Create server B as a follower of A.

    $ heroku addons:add heroku-postgresql:ronin --follow HEROKU_POSTGRESQL_AUBURN
    ... Attached as HEROKU_POSTGRESQL_BLUE
    
  • Wait for B to be up-to-date with A.

  • Disable the application so no new writes are made to A.
  • Wait for B to fully catch up.
  • Sever the connection between B and A, so B is an independent db.

    $ heroku pg:unfollow HEROKU_POSTGRESQL_BLUE
    
  • Reconfigure the app to use B as your database.

    $ heroku pg:promote HEROKU_POSTGRESQL_BLUE
    
  • Re-enable the application.

  • Destroy A.

    $ heroku addons:remove HEROKU_POSTGRESQL_AUBURN
    

This all works fine if you have a single database, but what happens when you have slaves? Heroku's docs only say that you'll have to re-create the slaves, without going into further depth. (n.b. Once you've done the process above, you can't immediately create the new slaves. The server needs a few hours before it can be forked/followed.)

Now, if your app can handle ditching the slaves entirely, then you can do the following:

  • disable the slaves
  • change over the master server
  • create new slaves
  • enable the new slaves

But if reducing your server farm to just a single server is going to cause the app to crash and burn, you need a way to keep your slaves up and running throughout this whole process!

My first thought was to create a server C, following B, even while B is still following A. When switching over, B becomes the new master and C the new slave. But, alas, Heroku (and presumably also PostgreSQL) doesn't support followers-following-followers.

QUESTION: How do you change over from one whole master/slave set to another set? How do you change over your master without having to run without slaves for an extended period of time?

And just in case it's note clear: I'm not asking about failing over to an existing slave. I want to upgrade all of the servers to a different-sized EC2 instance. And I want to maintain my master/slave setup throughout the process.

Best Answer

I don't know how well this works on Heroku but I think it should work if one is using streaming replication internally. I would recommend repmgr which is an open source replication cluster management tool from 2ndQuadrant. This provides a general monitoring and management framework for replicated database clusters. In 9.3, repmgr will be less necessary but still may be helpful for integration into heartbeat systems. It provides a command line tool for failovers, preparing slaves, etc.

If you have to do something like this manually and repmgr wont work for you, basically, what you need to do is this:

  1. turn off the master (shoot the other node in the head)

  2. promote the slave to a master

  3. re-base your other slaves and point them at the new master

  4. turn your old master into a new slave when you can bring it up.

How you do this will depend to some extent on the replication approach.