PostgreSQL strategies to deal with disk filling up

Tags: disk-space, postgresql, vacuum

I'm using PostgreSQL (8.4) to store data produced by an application that makes frequent inserts (into the tables described below).

The database keeps growing with time and, since the newer data is more relevant than the older data (in this particular application), deleting the older rows is a reasonable solution (either based on lower id or older input_datetime, which is more or less the same).

To prevent issues with this database (the only database running on this server) from affecting the rest of the system, I've put the PostgreSQL data directory on its own partition (ext3, on a Linux system). Nevertheless, when this partition becomes full, it causes a number of problems.

I'm thinking of deleting older data regularly (e.g. DELETE FROM data_group WHERE id <= ... via a cron job) to deal with this.
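
As a rough illustration, such a retention job could be a nightly crontab entry running a statement like the one below (a sketch only; the 30-day cutoff is an assumption, and deleting from data_group also removes the matching data_item rows through ON DELETE CASCADE):

-- Sketch of a nightly retention DELETE (the cutoff interval is an assumption),
-- e.g. run from cron via: psql -d my_database -f retention.sql
DELETE FROM data_group
WHERE input_datetime < now() - interval '30 days';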

Firstly, my understanding of VACUUM (as performed by autovacuum, which is enabled) is that, while it doesn't necessarily give disk space back to the OS (as VACUUM FULL would), it does allow new data to be inserted into the disk space already used (that is, the DELETEs don't necessarily shrink the files, but they do free space for reuse within PostgreSQL's own data structures). Is this correct? (I've noticed VACUUM FULL caused a few problems with the application itself, probably because of the locks it takes.)

If so, it also appears that SELECT pg_database_size('my_database') reflects the size used on disk, which doesn't necessarily reflect what's available for further inserts. Is there another way to estimate how much space is available for new inserts?
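
For what it's worth, one rough way to see how much of that on-disk size may be reusable is to look at dead-tuple counts (a sketch; n_dead_tup only counts rows whose space has not yet been reclaimed by VACUUM, so it is an estimate at best):

-- Total size of the database files on disk
SELECT pg_size_pretty(pg_database_size('my_database'));

-- Dead tuples per table: deleted rows whose space VACUUM has not yet reclaimed
SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;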

In addition, when it's too late and the partition is 100% full, running this DELETE statement causes the following error and crashes the PostgreSQL service:

PANIC: could not write to file "pg_xlog/xlogtemp.7810": No space left on device

The PostgreSQL daemon stopping is of course a major issue (and there is no other disk to move the cluster to on this machine).

Are there general strategies to prevent this sort of problem from occurring (knowing that disk space is constrained within a given partition, but that it can be acceptable to delete older data)? I would like to automate as much of this as possible, without root or postgres (or PostgreSQL admin) intervention.


CREATE TABLE data_group (
    id SERIAL PRIMARY KEY,
    name TEXT,
    input_datetime TIMESTAMPTZ
);

CREATE TABLE data_item (
    id SERIAL PRIMARY KEY,
    group_id INTEGER NOT NULL REFERENCES data_group(id) ON DELETE CASCADE ON UPDATE CASCADE,
    position INTEGER NOT NULL,
    data BYTEA
);

Best Answer

On one hand, you can have a look at one of my previous answers to see how you can keep a table's size more or less steady. There you will find a solution with triggers - of course, this can be solved using a cron job as well. In the latter case I would first check whether the row count exceeds a certain limit and then either delete the oldest rows or drop a partition.
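
For the cron-job variant, the check-and-delete could be as simple as the statement below (a sketch; the 1,000,000-row cap is just an assumed limit, and it relies on id values being assigned in insert order without large gaps):

-- Trim data_group down to roughly the newest 1,000,000 rows (the cap is an assumption);
-- the corresponding data_item rows disappear with their group via ON DELETE CASCADE.
DELETE FROM data_group
WHERE id < (SELECT max(id) - 1000000 FROM data_group);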

On the other hand, as you already noticed, one has to take care of the disk space used by pg_xlog. When it gets full, it is not that easy to recover... But by checking your database settings you can get a fair estimate of how much space you need:

There will always be at least one WAL segment file, and will normally not be more than (2 + checkpoint_completion_target) * checkpoint_segments + 1 or checkpoint_segments + wal_keep_segments + 1 files. Each segment file is normally 16 MB (though this size can be altered when building the server). You can use this to estimate space requirements for WAL. Ordinarily, when old log segment files are no longer needed, they are recycled (renamed to become the next segments in the numbered sequence). If, due to a short-term peak of log output rate, there are more than 3 * checkpoint_segments + 1 segment files, the unneeded segment files will be deleted instead of recycled until the system gets back under this limit.

If you don't have replication set up, the maximum is 3 * checkpoint_segments + 1 (times 16 MB). A typical setup without replication will need something under 10 GB for pg_xlog, I think.
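
To put numbers on that, you can read the setting and do the arithmetic directly (a sketch, assuming the default 16 MB segment size):

-- Rough upper bound on pg_xlog size without replication:
-- (3 * checkpoint_segments + 1) segments of 16 MB each.
SELECT pg_size_pretty(
    (3 * current_setting('checkpoint_segments')::bigint + 1) * 16 * 1024 * 1024
);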