Postgresql – Postgres 9.4.5 small table grows into huge size – periodic VACUUM FULL needed

postgresqlpostgresql-9.4vacuum

We are running a PostgreSQL database on Windows Server. The database version is 9.4.5 (it's been running stable for a long time now so we aren't updating it).

Some of our tables have millions of rows, and they are running fine.

What causes problems is a tiny little table that is only ~2000 rows.
This table is used to store current status reported from remote devices. These devices are sending their updated status every few seconds, so there are quite a few updates in that database each second. Nothing is being removed though, the rows are kept intact they are just being updated every time a device reports new status.

Some time ago I have noticed huge performance drop on our Apache server that uses this database. What I saw was that the status table was >3GB in size while having only 2000 rows. I read a lot on the internet, tried normal VACUUMs but they didn't do anything.

I finally went for VACUUM FULL as I saw that it is safe to use it on versions 9.0 and above. That really helped, the table size dropped just to a few KBs!

However, the problem is happening periodically every few days, and now when I try to run VACUUM FULL it takes a very long time, and our whole service is unusable during that time.

What could I do to prevent this from happening? Autovacuum is turned on, but it seems to do nothing in this particular case.

Output of VACUUM VERBOSE:

username=# vacuum verbose status;
INFO:  vacuuming "public.status"
INFO:  index "status_pkey" now contains 3295731 row versions in 11704 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.07s/0.01u sec elapsed 0.14 sec.
INFO:  "status": found 0 removable, 4472204 nonremovable row versions in 400557 out of 400558 pages
DETAIL:  4470176 dead row versions cannot be removed yet.
There were 23081 unused item pointers.
0 pages are entirely empty.
CPU 2.87s/0.70u sec elapsed 3.64 sec.
INFO:  vacuuming "pg_toast.pg_toast_16753"
INFO:  index "pg_toast_16753_index" now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_toast_16753": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM

As you can see, there are 0 removable row versions, despite the table has only 2039 physical rows.

SELECT n_live_tup, n_dead_tup
from pg_stat_user_tables
where relname = 'status';

 n_live_tup | n_dead_tup
------------+------------
       2039 |    4473416

At that moment,

SELECT pg_size_pretty(pg_total_relation_size('status'))

reported a size of 3221 MB, for only 2039 rows in the table. That is hilarious.

Best Answer

Something is keeping autovacuum from removing those dead tuples.

These can be only three things that keep VACUUM from doing its job:

  1. A long running concurrent transaction. Check with

    SELECT pid, datname, usename, state, backend_xmin
    FROM pg_stat_activity
    WHERE backend_xmin IS NOT NULL
    ORDER BY age(backend_xmin) DESC;
    
  2. An abandoned replication slot. Check with

    SELECT slot_name, slot_type, database, xmin
    FROM pg_replication_slots
    ORDER BY age(xmin) DESC;
    
  3. An orphaned prepared transaction. Check with

    SELECT gid, prepared, owner, database, transaction AS xmin
    FROM pg_prepared_xacts
    ORDER BY age(transaction) DESC;
    

But all that would also block VACUUM (FULL) from shrinking the table. So there must be something extra you are doing when you run VACUUM (FULL).

You have autovacuum enabled, right?

This is what you should do:

  1. Remove whatever is blocking VACUUM.

  2. Make sure autovacuum runs fast enough on that table:

    ALTER TABLE status SET (autovacuum_vacuum_cost_delay = 0);
    

If the data modifications are caused by UPDATE you could also try to get HOT updates. These remove the need for VACUUM. A prerequisite is that the updated column or columns are not indexed.