Postgresql – Postgres auto analyze performance

autovacuumpostgresqlpostgresql-9.4

On postgresql 9.4 I have a table that never has any updates or deletes, only inserts and selects. It therefore never gets auto vacuumed, but it does still get auto analyzed and when it does it can take over 100 seconds. For that 100 seconds and for a significant amount of time afterwards DB performance is terrible, with my app reporting db response times in the 40-150 second range instead of the normal 2ms range.

I've always heard that you shouldn't disable auto vacuum but when I set autovaccum_enabled=false for this table the db performs much more reliably. Is there a reason to have it enabled on this table? Will the query planner eventually suffer from not having auto analyze running on a table that never gets updates or deletes? Is there a better way to solve this?

EDIT: Potentially modifying autovacuum_analyze_threshold and autovacuum_analyze_scale_factor for this table to make analyze happen more frequently would be better?

Best Answer

I think your best option is to use your favorite OS scheduling tool to run ANALYZE <table> explicitly and pre-emptively at the time of your choosing, say at 3 a.m.

PostgreSQL's autovacuum and autoanalyze are activity-counter driven. This has the unfortunate effect that the time when the activity counters cross the thresholds is mostly likely to be exactly when the database is most active, which is when you least want those maintenance tasks to run. By running the task preemptively during a off-peak time you will reset the activity counters, so that the auto-versions rarely will find work to do. But in case of something going wrong, they are still there to save your bacon.

Will the query planner eventually suffer from not having auto analyze running on a table that never gets updates or deletes?

This can't be answered with the information you provide. If that table gets nothing but single-row lookups via a unique index, for example, out-of-date statistics are unlikely to be a problem. If it gets complicated analytics queries, it is likely to be much more of a problem.