I have a 14.7 TB Postgres(9.1) database that has to endure heavy writes for prolonged periods of time. I know that I need to closely manage my transaction ids to prevent db lockup. Recently I noticed the queries slowing, and saw that there were multiple autovacuums running on our massive read-only tables, with the "(to prevent wraparound)" subscript. I stopped the running software and executed a vacuumdb -F -a
command. However, when I run select current_query from pg_stat_activity
I see that the autovacuum processes are still running, even during a manual vacuum. I tried killing the autovacuums with select pg_cancel_backend(pid)
and they died but then immediately restart again. My questions:
- Is autovacuum supposed to continue to run during a manual db vacuum?
- How do I effectively get these autovacuum processes to stop?
- Why would these autovacuums continue to run on a read-only table? What is there to vacuum?
Best Answer
Autovacuum is triggered by the table statistics on the table, and as long as your manual
VACUUM (FREEZE)
is not done, these are not updated. That is why anti-wraparound autovacuum processes will still start.But that's not a big problem: Only one
VACUUM
can run on a table at any given time. Now anti-wraparound autovacuum workers won't give up when they block another process, in that case your manualVACUUM
. But if you kill the anti-wraparound autovacuum workers, your manualVACUUM
gets the lock, and now it is the restarted anti-wraparound autovacuum worker that is blocked. Look atpg_locks
to verify that they are waiting for a lock (granted = FALSE
).Now your
vacuumdb
processes one table after the other, so you must be ready to kill autovacuum workers as soon as it starts processing the next table and is blocked by a new autovacuum worker there.It might be easier to manually launch
VACUUM
on those big read-only tables rather than usingvacuumdb
, because then you can control when which table gets vacuumed.Make sure you set
maintenance_work_mem
high to speed upVACUUM
. You should also setautovacuum_vacuum_cost_delay
to 2ms or lower at least on these big tables, so that future autovacuum runs get done faster.To reduce the pain for the future, lower
autovacuum_freeze_max_age
on the affected tables substantially. Then the next anti-wraparound vacuum will kick in sooner and will be done faster.Best of all, upgrade to v13 as soon as you can, because from that version on insert-only tables will receive regular vacuum runs as well, which should do away with the problem.