Postgresql – pgpool-II configuring multiple master databases

pgpool, postgresql

I could not find a solution for this anywhere, not even in the pgpool manuals.

How do I configure multiple master PostgreSQL databases with pgpool-II? An additional requirement is that each master should have its own set of slaves using streaming replication. The reason for multiple masters is that each master holds one partition of a large data set.

The data set is large (say about 100 million users, their account information, plus additional information about their accounts) and is partitioned into silos. The data model is being designed so that the silos do not need to talk to each other.

Any help and guidance is much appreciated.

Best Answer

OK, your first decision is where to split the silos out: in the application layer or in the db layer. If your application keeps track of the silos, then you can run an individual pgbouncer or pgpool server on a different port for each silo.

If you want the application to not have to know or deal with how siloing is done, then something like plproxy can work wonders. Basically you have one or more plproxy servers that distribute all the queries and data based on some predetermined modulo math or some other method for breaking it up. Your app just calls the plproxy server with its queries, and the plproxy server is where the siloing logic lives.

Each method has its pluses and minuses. The nice thing about doing it in the app layer is that your database layer is fairly simple: just a bunch of master/slave pairs. If you add a new silo, you just edit the config file for it in the app and it goes online. The disadvantage is that every app hitting your data HAS to be silo aware.

With plproxy the advantage is that the app never knows the data is siloed. If you write a fairly simple SOAP-type interface to talk to the plproxy servers, then every app you write can use that single interface and never has to be told which silo does what. The complexity, though, is now moved into the db layer: any change to the farm structure requires work at the db level. Also, plproxy works by wrapping all your database work into functions ahead of time. If you want to select data from a table joined to another table, you need to create a function that does that and call that function on the plproxy machine. It seems overly complex at first, but it's actually not that hard, just quite different from slinging random SQL at your db servers.
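To make the app-layer option concrete, here is a minimal sketch of silo-aware routing in shell. Everything specific in it is an assumption for illustration only: the silo count, the base port, the one-pgpool-per-port layout, and the function names silo_for_user and port_for_user are hypothetical, and plain modulo on a numeric user id is just one possible partitioning rule.

```shell
#!/bin/sh
# Hypothetical layout: silo N is reached through its own pgpool (or
# pgbouncer) instance listening on port BASE_PORT + N.
SILO_COUNT=4       # assumed number of silos
BASE_PORT=9999     # assumed port of the pooler in front of silo 0

# Print the silo index that owns a numeric user id (plain modulo).
silo_for_user() {
    echo $(( $1 % SILO_COUNT ))
}

# Print the pooler port that serves the silo owning the given user id.
port_for_user() {
    silo=$(silo_for_user "$1")
    echo $(( BASE_PORT + silo ))
}

# The app would then connect to that port, for example (hypothetical
# host, user and database names):
#   psql -h pooler-host -p "$(port_for_user 1234567)" -U app accounts
```

Note that with plain modulo, changing SILO_COUNT moves existing users to different silos, so an explicit user-range or lookup-table mapping in the app config may be the easier thing to grow.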

Related Solutions

Postgresql – PGPool Frequent Queries

I don't think these are issued by PGPool. I think they are instigated by your application (which I am guessing is written in PHP and running over some sort of ORM).

The first entry is DEALLOCATE, which tells the server to free the memory held by a prepared statement. These look like they are issued through some PDO module.

The second and third queries look like ORM-related mapping queries. Your application is probably unaware that these queries are being issued by the framework it runs on, but they only really make sense in an ORM or similar environment.

PostgreSQL – PGpool Promote to Master and Replication Issues

Ok so. By design, pgpool does the above.

The follow_master_command in the pgpool.conf file is what you need to use to resolve this issue.

I use the following script (follow_master.sh, which I place in the /etc/pgpool-II folder):

#!/bin/sh

################
##
## $1 = node id
## $2 = Old master node id
## $3 = node hostname
##
###############
PGPOOLIP=10.**.**.**
PGUSER=postgres
PGPASS=*************
PGHOME=/var/lib/pgsql/9.3
REMOTE_PGDATA=/var/lib/pgsql/9.3/data

if [ "$1" = "$2" ]; then
        # This node is the old master: just detach it from pgpool.
        /usr/bin/pcp_detach_node 10 "$PGPOOLIP" 9898 "$PGUSER" "$PGPASS" "$1"
else
        # This node is a standby: wait briefly, then stop PostgreSQL on it.
        sleep 5
        # \$LD_LIBRARY_PATH is escaped so it expands on the remote host.
        ssh -T "postgres@$3" "
        export LD_LIBRARY_PATH=$PGHOME/lib:\$LD_LIBRARY_PATH;
        $PGHOME/bin/pg_ctl -w -D $REMOTE_PGDATA stop"
        # Detach the node, run online recovery on it, then reattach it.
        /usr/bin/pcp_detach_node 10 "$PGPOOLIP" 9898 "$PGUSER" "$PGPASS" "$1"
        /usr/bin/pcp_recovery_node 10 "$PGPOOLIP" 9898 "$PGUSER" "$PGPASS" "$1"
        /usr/bin/pcp_attach_node 10 "$PGPOOLIP" 9898 "$PGUSER" "$PGPASS" "$1"
fi
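
For the script to receive the three arguments listed in its header, pgpool.conf has to pass them in that order. The answer does not show this line, so the following is only a sketch under assumptions: it uses pgpool-II's placeholder substitutions (%d for the node id, %M for the old master node id, %h for the node hostname) and the /etc/pgpool-II path mentioned above. Check the placeholders against the documentation for your pgpool-II version before relying on it.

```conf
# pgpool.conf fragment (assumed wiring, not taken from the original answer)
# %d -> $1 node id, %M -> $2 old master node id, %h -> $3 node hostname
follow_master_command = '/etc/pgpool-II/follow_master.sh %d %M %h'
```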

I've done testing and it seems to be working fine.

I hope this helps someone in the future.

Thanks

Rob
