Postgresql – Security considerations of allowing logical replication subscribers on PostgreSQL

postgresqlreplicationSecurity

I run a non-profit dedicated to sharing data kind of like Wikipedia.

We recently had a client that wanted a replica of our database, and we realized that by using PostgreSQL's new logical replication we can replicate tables of our DB to a server they control.

This would be great for fulfilling our mission of sharing the data. It's 100× better than providing slow APIs.

We created a role for them like this:

CREATE ROLE some_client WITH REPLICATION LOGIN PASSWORD 'long-password';
GRANT SELECT ON TABLE some_table TO some_client;

And we created a PUBLICATION for them like this:

CREATE PUBLICATION testpublication FOR TABLE ONLY some_table;

Is there any risk of doing this? My analysis is that this gives them SELECT access to a table that they're replicating to their own server, but that's it. Any other concerns? If there are concerns, are there ways to make this work? We have tables we don't want to share, but most of our tables only have public data.

Best Answer

Note that the GRANT SELECT is only needed for the subscriber in order to do a bulk copy of the initial data in the table. If they do not have that GRANT, they can still start a subscription, and they would be able to see any newly inserted rows or the new version of any updated rows. The fact that the GRANT SELECT is needed for full functionality seems like wart in the design, as declining to GRANT the SELECT access does not deny them access to all the data, merely the subset that already existed and was never updated. This is confusing in a way that could lead to security mistakes.

Also, there is no fine-grain control to which users can subscribe to which publications in a database (at least, not that I can find). Any user WITH REPLICATION who can connect to a given database can initiate subscriptions to all publications in that database. Combined with the previously discussed limitation in which denying SELECT on a table doesn't deny all visibility to the contents of the table, this is really limiting. Although maybe not in your use case, if you only have one outside party, or if all outside parties need access to the same thing.

Cascading logical replication is possible. So you can gain some fine-grained control by creating one "sterile" database for each publication you want separate control over. In that sterile database, you create a subscription to the publication in the main database (connecting to it with some internal role if course, not the external user's role), and create a publication of those same tables. Then you grant the external user's role CONNECT (or using pg_hba.conf settings) to only the sterile database and not the main database. You could do this either with a separate database server, or a separate database within the same database server. It does annoyingly increase the storage needs, and the CPU overhead of republishing the data internally.

In order to let the outside party connect in order to do the replication, you will have to poke a hole in your firewall to let them in. Doing that could open you up to various miscreants trying to hack into your database. So make sure you have strong passwords (or some better form of authentication, like client certs) on all roles, not just "some_client". Or poke the firewall hole only for the client's domain or IP address range. Or make sure all your roles in pg_hba other than "some_client" are allowed only from internal IP addresses. Or all of the above.

Finally, you open yourself up to accidental (or even intentional) denial of service attacks from this client. If they fail to keep their database up and running and following along with your publication, it would cause your database to start retain pg_wal files, perhaps to the point where it fills up your storage system and causes downtime. So you should have some way of monitoring pg_wal size and be prepared to forcibly drop the replication slot if it is not keeping up.