Postgresql – after deleting 100 millions of records in postgresql free space doesn’t change

deletepostgresql

Recently I configured netflow for monitoring traffic of my company. I use pmacct to collect netflow packets and PostgreSQL for storing them.

I only keep netflow records for a week and after that every day I delete all records that older than 7 days.

this script runs every day with cronjob:

date >> /root/log/cronlog

du -h /var/lib/postgresql/data/base/ >> /root/log/cronlog

psql -U pmacct -d pmacct_switch -c "DELETE FROM acct_v9 WHERE stamp_updated < NOW() - INTERVAL '7 days';" >> /root/log/cronlog
psql -U pmacct -d pmacct_switch -c "VACUUM;" >> /root/log/cronlog

du -h /var/lib/postgresql/data/base/ >> /root/log/cronlog

date >> /root/log/cronlog

and this is the cronlog file:

Wed Jul 8 01:00:02 +0430 20207.4M

/var/lib/postgresql/data/base/17.5M

/var/lib/postgresql/data/base/1299480G

/var/lib/postgresql/data/base/167654.0K

/var/lib/postgresql/data/base/pgsql_tmp7.4M

/var/lib/postgresql/data/base/129936.5M

/var/lib/postgresql/data/base/1638480G

/var/lib/postgresql/data/base/

DELETE 100424891
VACUUM

7.4M

/var/lib/postgresql/data/base/17.5M

/var/lib/postgresql/data/base/1299481G

/var/lib/postgresql/data/base/167654.0K

/var/lib/postgresql/data/base/pgsql_tmp7.4M

/var/lib/postgresql/data/base/129936.5M

/var/lib/postgresql/data/base/1638481G 
/var/lib/postgresql/data/base/Wed Jul 8 02:29:33 +0430 2020

Ss you see after deleting 100 millions of records free space doesn't increase!

So, what's the solution?

Best Answer

The documentation describes that well:

The standard form of VACUUM removes dead row versions in tables and indexes and marks the space available for future reuse. However, it will not return the space to the operating system, except in the special case where one or more pages at the end of a table become entirely free and an exclusive table lock can be easily obtained. In contrast, VACUUM FULL actively compacts tables by writing a complete new version of the table file with no dead space. This minimizes the size of the table, but can take a long time. It also requires extra disk space for the new copy of the table, until the operation completes.

For an application where you have to delete lots of rows regularly, you might consider partitioning. If you can partition the table in a way so that you can drop partitions rather than running DELETE, you won't have no bloat problem any more.