PostgreSQL – Slow Queries and Autovacuum Frequency

performancepostgresqlpostgresql-9.1postgresql-performance

We've noticed the performance of our platform drop in recent weeks so I've run the following:

select relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze 
from pg_stat_user_tables 
where relname like 'core_%';

And noticed that our primary table hadn't been autovacuumed for more than a week. So last week I ran:

vacuum analyse verbose TABLENAME

Which seemed to help, but we've run into the same issue again now. Upon closer inspection, a lot of the tables have either never been analysed (auto or otherwise) and aside from the manual vacuum analyse run last week, none of the tables have been manually vacuumed, and a lot of the other tables haven't been autovacuumed since, at best, a few days ago, and at worse a few weeks ago.

My understanding of the terms is as follows:

  • vacuum: Clears out deleted records from disk
  • analyse: Updates the
    query planner

In postgres.conf, the autovacuum property is commented out, but the documentation states that this is on by default, so my presumption is that even though it's commented out, it should still be on?

Can someone explain why the tables wouldn't be getting vacuumed and analysed frequently and more specifically, would these values not getting updated actually have that much of an impact on the system?

Info:
Postgres 9.1
OS: Ubuntu 12.04

Output of

SELECT relname as "Table",
pg_size_pretty(pg_total_relation_size(relid)) As "Size",
pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) as "External Size"
FROM pg_catalog.pg_statio_user_tables 
ORDER BY pg_total_relation_size(relid) DESC;


     Table       | Size | External Size
-----------------+------+---------------
"Primary Table"  | 27G  |     8232M

Best Answer

vacuum: Clears out deleted records from disk

Not exactly. VACUUM marks rows that aren't visible to any active transaction any more as dead (and ready for re-use). It does not shrink the physical file size that represents the table, except for completely dead / empty pages at the physical end. The manual explains everything and probably better than I could recap here.

analyse: Updates the query planner

It's officially ANALYZE, with Z, but ANALYSE is accepted as alternative spelling, too. And it updates statistics used by the query planner. Again, the manual already provides the best explanation.

In postgres.conf ... my presumption is that even though it's commented out, it should still be on?

That's correct. Again, consider details in the manual.

With that many write operations (a few thousand records a minute) your system is in bad need of regular VACUUM / ANALYZE runs. You have read the manual by now, so you understand the consequences. If your table ...

hadn't been autovacuumed for more than a week.

... then that's bad for multiple reasons. Also consider @Daniel's answer how this may have transpired with a huge table like yours. Or maybe the high load constantly locks the table and never lets VACUUM do its work. Again, it's all documented in the manual. Here is a related case with good answers how to tune settings:

Remember that you can have per-table settings (STORAGE parameters) to fine tune for special needs of a special table and leave the rest of the system alone.

If you are mostly updating recently inserted rows, a FILLFACTOR below 100 may be very helpful. You can compact the table (once) with CLUSTER or pg_repack an then set the FILLFACTOR below 100. And for huge tables it may also help to set a higher STATISTICS target for key columns with irregular data distribution.

Also, if old rows are updated seldom, partitioning might be a good solution, to treat old sections differently. It really depends on the complete picture ...

Also, don't forget indices, those can get bloated, too. Only keep indices that you actually need.

To see a count of dead and live tuples: