I'm using PostgreSQL 9.1 on Ubuntu. Are scheduled VACUUM ANALYZE
still recommended, or is autovacuum enough to take care of all needs?
If the answer is "it depends", then:
- I have a largish database (30 GiB compressed dump size, 200 GiB data directory)
- I do ETL into the database, importing close to 3 million rows per week
- The tables with the most frequent changes are all inherited from a master table, with no data in the master table (data is partitioned by week)
- I create hourly rollups, and from there, daily, weekly and monthly reports
I'm asking because the scheduled VACUUM ANALYZE
is impacting my reporting. It runs for more than 5 hours, and I've had to kill it twice this week, because it was impacting regular database imports. check_postgres
doesn't report any significant bloat on the database, so that's not really an issue.
From the docs, autovacuum should take care of transaction ID wrap around as well. The question stands: do I still need a VACUUM ANALYZE
?
Best Answer
VACUUM is only needed on updated or deleted rows in non-temporary tables. Obviously you're doing lots of INSERTs but it's not obvious from the description that you're also doing lots of UPDATEs or DELETEs.
These operations can be tracked with the
pg_stat_all_tables
view, specifically then_tup_upd
andn_tup_del
columns. Also, even more to the point, there is an_dead_tup
column that tells, per table, how much rows need to be vacuumed. (see Monitoring statistics in the doc for functions and views related to statistics gathering).A possible strategy in your case would be to suppress the scheduled VACUUM, keeping an eye on this view and checking on which tables the
n_dead_tup
is going up significantly. Then apply the aggressive VACUUM to these tables only. This will be a win if there are large tables whose rows never get deleted nor updated and the aggressive VACUUM is really necessary only on smaller tables.But keep running the ANALYZE for the optimizer to always have fresh statistics.