On one hand, you can have a look at one of my previous answers to see how you can keep a table size more or less steady. There you will find a solution with triggers - of course, this can be solved using a cron job as well. In the latter case I would first check whether the row count exceeds a certain limit, and then either delete the oldest rows or drop a partition.
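A minimal sketch of the cron-job variant (the table and column names, log_table and created_at, and the one-million-row limit are all made up for illustration):

    -- Keep roughly the newest 1,000,000 rows; delete everything older.
    -- If the table holds fewer rows than the limit, the subquery
    -- returns NULL and nothing is deleted.
    DELETE FROM log_table
    WHERE created_at < (
        SELECT created_at
        FROM log_table
        ORDER BY created_at DESC
        LIMIT 1 OFFSET 1000000
    );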
On the other hand, as you already noticed, one has to take care of the disk space where pg_xlog resides. When it gets full, it is not that easy to recover... But by checking your database settings you can get a fair estimate of how much space you need:
There will always be at least one WAL segment file, and will normally not be more than (2 + checkpoint_completion_target) * checkpoint_segments + 1 or checkpoint_segments + wal_keep_segments + 1 files. Each segment file is normally 16 MB (though this size can be altered when building the server). You can use this to estimate space requirements for WAL. Ordinarily, when old log segment files are no longer needed, they are recycled (renamed to become the next segments in the numbered sequence). If, due to a short-term peak of log output rate, there are more than 3 * checkpoint_segments + 1 segment files, the unneeded segment files will be deleted instead of recycled until the system gets back under this limit.
If you don't have replication set up, the maximum is 3 * checkpoint_segments + 1 (times 16 MB). A typical replicationless setup will need something under 10 GB for pg_xlog, I think.
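You can check the relevant settings and do the arithmetic from within the database; a quick sketch for PostgreSQL releases that still have checkpoint_segments (it was replaced by max_wal_size in 9.5):

    SHOW checkpoint_segments;
    SHOW wal_keep_segments;

    -- Rough upper bound for pg_xlog without replication, in MB:
    SELECT (3 * current_setting('checkpoint_segments')::int + 1) * 16
           AS max_pg_xlog_mb;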
If you have an index on LastLoggedIn, deleting a few tens of thousands of records should be fast without needing to partition. I ran a small test on MS SQL:
CREATE TABLE Temp.Temp
(
    UserID INT NOT NULL PRIMARY KEY,
    -- An inline index needs a name in T-SQL:
    LastLoggedIn DATETIME2(0) NOT NULL INDEX IX_Temp_LastLoggedIn,
    Dummy1 VARCHAR(30),
    Dummy2 INT
);
I inserted ~57 M dummy records, then selected a LastLoggedIn date such that 35,000 records were at least that old. Deleting those records took less than a second. This is on a four-core machine with 128 GB of memory, under light load; your mileage may vary.
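The test delete was essentially a range delete on the indexed column; a sketch of what it might look like (the cutoff date is made up):

    DECLARE @cutoff DATETIME2(0) = '2015-01-01';

    -- Fast because the index on LastLoggedIn turns this into a range scan.
    DELETE FROM Temp.Temp
    WHERE LastLoggedIn < @cutoff;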
Of course, if you run this purge only every few weeks, you may have millions of records to delete rather than thousands (300 K records took seven seconds for me; 1.3 M records took 30 seconds). But in that scenario, it's part of your monthly maintenance window rather than daily upkeep, and a brief delay should be acceptable.
If you get poor performance with just an index, then sure, consider partitioning. As others have pointed out, there will be a significant cost to UPDATE operations, a greater overhead than maintaining an index. You will have many more updates than purges, and the latter can happen during relatively idle periods, so I would worry about the updates more.
If you need always-on uptime, and the delete locks the table for an unacceptable length of time, that might justify using partitioning, but even there I'd look for alternatives. You could SELECT a list of old users and then delete them one at a time (confirming their LastLoggedIn dates first, of course) with a cursor, for example.
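A rough sketch of that cursor approach (the six-month cutoff is illustrative):

    DECLARE @id INT;
    DECLARE @cutoff DATETIME2(0) = DATEADD(MONTH, -6, SYSDATETIME());

    DECLARE old_users CURSOR LOCAL FAST_FORWARD FOR
        SELECT UserID FROM Temp.Temp WHERE LastLoggedIn < @cutoff;

    OPEN old_users;
    FETCH NEXT FROM old_users INTO @id;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- Re-check LastLoggedIn in the delete itself, in case the user
        -- has logged in since the list was built.
        DELETE FROM Temp.Temp
        WHERE UserID = @id AND LastLoggedIn < @cutoff;

        FETCH NEXT FROM old_users INTO @id;
    END;
    CLOSE old_users;
    DEALLOCATE old_users;

Each single-row delete holds its locks only briefly, which is the point of the exercise.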
I'm an MS SQL guy, so my apologies if any of this is incorrect for PostgreSQL.
Best Answer
You can create a new tablespace on a different disk and then move those tables to that new tablespace.
First, create a new tablespace on a disk that has enough space:
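Something along these lines (the path and the tablespace name are placeholders):

    -- The directory must already exist and be owned by the
    -- postgres operating-system user.
    CREATE TABLESPACE new_space
        OWNER your_postgres_user
        LOCATION '/mnt/bigdisk/pg_tablespace';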
By making your_postgres_user the owner of that tablespace, the user automatically gets all privileges on that tablespace.
Then you need to find out which tables those files belong to:
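Assuming what you have is a numeric file name from the data directory (that number is the table's relfilenode), one way to look it up:

    -- Replace 123456 with the file name you found on disk.
    SELECT relname
    FROM pg_class
    WHERE relfilenode = 123456;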
Then move the tables:
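For example (table and tablespace names are placeholders; note that this takes an exclusive lock on the table while its files are copied to the new location):

    ALTER TABLE big_table SET TABLESPACE new_space;

    -- Indexes live in their own files and must be moved separately:
    ALTER INDEX big_table_pkey SET TABLESPACE new_space;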