Auto-vacuuming should eventually get around to cleaning it up (assuming you haven't disabled it), but it may not be getting around to it soon enough for your purposes. There are many settings which can control auto-vacuuming and how/when it's done, which may be of interest: here and here.
This can be especially true of tables with high churn. That is, tables with lots of insertions and deletions. Long-running and idle transactions can also be a factor here, as MVCC will kick in and prevent the dead tuples from being reclaimed. The fact that manually doing a VACUUM
frees the dead tuples suggests that this isn't the case for you, though, and it may be the former issue instead.
In general, it's not recommended to do a VACUUM FULL
, as that takes out an exclusive table lock, particularly when most rows in a table have been updated/deleted.
From the doc:
The FULL option is not recommended for routine use, but might be
useful in special cases. An example is when you have deleted or
updated most of the rows in a table and would like the table to
physically shrink to occupy less disk space and allow faster table
scans. VACUUM FULL will usually shrink the table more than a plain
VACUUM would.
Is your usage pattern such that this would be the case? You did mention a "direct copy" was involved, but it's not clear exactly how that's being done.
I have had cases with high-churn tables where the default auto-vacuum rate just wasn't enough, and even relatively small amounts of dead tuples would greatly affect the query speed (this was in a large table which was queried very often and where the query needed to be extremely fast, and as such, was highly affected by dead tuples).
To help with this, I setup a manual VACUUM ANALYZE
of the table (so it will both free up the tuples and aid the query planner by updating the stats) in a cron job that was set to run every 5 minutes. Since there weren't that many dead tuples, the VACUUM
was pretty fast, and the constant vacuuming keeps the dead tuple count low enough so as to keep queries of that table fast.
Edit in response to comment from OP:
In the VACUUM doc, it says that:
VACUUM reclaims storage occupied by dead tuples
The doc then says that (emphasis mine):
VACUUM ANALYZE performs a VACUUM and then an ANALYZE for each selected
table. This is a handy combination form for routine maintenance
scripts. See ANALYZE for more details about its processing.
So it definitely reclaims dead tuples.
The table itself is only locked in ROW EXCLUSIVE mode, which shouldn't prevent any normal operations on the table, only things like DROP, ALTER, and CREATE INDEX.
Each individual row that is being deleted will be locked for the duration. This should only block other processes if those other processes are trying to update the rows (or delete them themselves). Normally with bulk deletions you are deleting rows that no one else cares about, which means no one else will be trying to update them.
If you really need to unlock all the deleted rows and the table periodically, then you need to do each loop of the deletions in a separate transaction. There is no easy way to do this in a DO block because the entire block runs in a single transaction. So you should put your loop outside of PostgreSQL, like in bash, perl, python, etc.
Best Answer
Autovacuum will start right away, but it takes some time to process the table. During that time, you don't see a decrease in the number of dead tuples.
To make autovacuum be more aggressive, don't make it start sooner, but make it faster. A threshold of 10000 with a scale factor of 0 seems insane to me, it will cause autovacuum to permanently process this table (at a snail's pace), and you don't end up with significantly less bloat.
Every table needs a certain amount of bloat, and the 20% you have by default are typically fine.
So I recommend these settings:
That will cause autovacuum to process the table as fast as possible, and you won't end up with much more than 20% bloat. Autoanalyze will run very frequently.
If you really don't want 20% of bloat in spite of what I said, consider setting
autovacuum_vacuum_scale_factor
to 0.1. I wouldn't go lower than that.