Postgresql – Make Postgres database temporarily read-only (for performing volume snapshots)

backupmaintenancepostgresqlread-only-database

The PostgreSQL built-in backup mechanism isn't always very suitable. Sometimes, you want to put the application in a quiescent state, because it has external data with which you want to backup at the same time you back up the PG data. But the only way to put the application in a quiescent state is to "lock" the database also. PG lacks a database-wide or cluster-wide locking mechanism. Putting PG into a read-only state would be a piece in the following solution:

Quiesce application data (disable logins)
Quiesce database (by making it read-only)
Perform a PG checkpoint or pg_xlog_switch()
Create a snapshot of the App and Data volumes
Resume the database (make it RW again)
Resume the application
Backup the snapshots

Best Answer

After culling answers elsewhere on the internet, I devised a solution. The other answers were in and of themselves, incomplete. So I am presenting an answer here in hopes it will benefit others.

The Strategy

Disable connections to the database (not the cluster).
Set the database's default_transaction_read_only setting to true.
Terminate the existing connections to that database.
Re-enable (read-only) connections.

Once that is done, you would (in my solution):

Perform the CHECKPOINT (I think this is the safest, but a pg_xlog_switch() would be appropriate for very high-load servers)
Take the volume snapshot
Reverse the previous steps. (But this is tricky!)

Pitfalls

Terminating connections while they are mid-transaction is probably a bad idea. Better to kill idle connections, wait for a few seconds, then kill idle ones, wait a few more, repeat until they are all gone.
At some point, you'll have to kill open/hung queries or abort the backup.
At the start of a transaction of session, Postgresql takes a kind of a snapshot of the process table. You have to reset this snapshot every time you go to check if unwanted processes are still there. See pg_stat_clear_snapshot()
Restoring the read-write state is not so simple. If read-only connections now exist, you must terminate them in order for the new read-write status to take effect. But new connections might arrive while killing existing ones. So again, you must
1. Disable connections to the database
2. change default_transaction_read_only status to false
3. kill the existing connections
4. Re-enable (r/w) connections to the database

Alternate strategy

Another strategy is to change the permissions on the role used by the application. This can be quite messy and is less general.

For instance, you'd have to revoke/re-grant on not just tables, but sequences, large objects, and probably the schema itself. Further, what exactly is the behavior of existing connections when you change the access? Probably no impact, which means you also need to kill those backends. Finally, let's say the application has read-write access to most tables, but not to others in the schema. You'd have to make sure your re-granting doesn't include those objects as well.

Another possibility is to LOCK all the tables, by querying the catalog and performing a dynamic query. That seemed to perilous for my tastes.

Implementation

Pause_service

The database instance name is 'gitlabhq' and the username of the application is 'gitlab'. Replace it with your own:

  psql -Upostgres  <<'PAUSE_DB'
    -- 1. disable new connections
    alter database gitlabhq_production with allow_connections = off;
    -- 2. Make DB read-only
    alter database gitlabhq set default_transaction_read_only = true;
    -- 3. Inobtrusively but safely terminate current connections
    DO $X$ BEGIN
        -- kill open idle connections, try up to 9x. Last time, kill regardless
        FOR i IN 1..10 LOOP
          PERFORM pg_terminate_backend(pid) from pg_stat_activity where usename = 'gitlab'
            and (i >= 10 OR state in ('idle', 'disabled' ));
          PERFORM pg_stat_clear_snapshot();
          EXIT WHEN NOT EXISTS ( select pid from pg_stat_activity where usename = 'gitlab' );
          RAISE NOTICE 'pg backends still open: sleeping 2 seconds';
          PERFORM pg_sleep(2);
          PERFORM pg_stat_clear_snapshot();
        END LOOP;
        -- send notice if still open connections
        IF EXISTS ( select pid from pg_stat_activity where usename = 'gitlab' ) THEN
            RAISE NOTICE 'Hung backends. Backup might not be 100%% consistent';
        END IF;
    END;$X$;
    -- 4. Allow read-only connections while checkpointing/snapshotting
    alter database gitlabhq with allow_connections = on;
    CHECKPOINT;

Resume

    alter database gitlabhq_production with allow_connections = off;
    alter database gitlabhq set default_transaction_read_only = false;
    SELECT pg_stat_clear_snapshot();
    SELECT pg_terminate_backend(pid) from pg_stat_activity where usename = 'gitlab';
    alter database gitlabhq with allow_connections = on;

There's a possibility that in this last step you will kill long-running read-only/SELECT queries, but in my experience, such long-running queries can last minutes if not hours, and it's acceptable to kill these in order to ensure uptime for everyone else.

Best Answer

The Strategy

Pitfalls

Alternate strategy

Implementation

Pause_service

Resume

Related Solutions

Postgresql – How does PostgreSQL handle Checkpoints in the middle of a WAL-enabled backup

PostgreSQL “freeze”/“unfreeze” command equivalents

Atomic file-system snapshot

pg_start_backup / pg_stop_backup

pg_basebackup

pg_dump

Write quiescence

Related Question

`pg_start_backup` / `pg_stop_backup`

`pg_basebackup`

`pg_dump`