I have production and staging RDS instances on Amazon, and staging's data is a direct copy of production, so both instances hold duplicate data.
Running EXPLAIN ANALYZE SELECT * FROM my_table WHERE my_col = true;
on staging resulted in this:
Seq Scan on my_table  (cost=0.00..142775.73 rows=1 width=1436) (actual time=18170.294..18170.294 rows=0 loops=1)
  Filter: my_col
  Rows Removed by Filter: 360275
Whereas in production, it was:
Seq Scan on my_table  (cost=0.00..62145.88 rows=1 width=1450) (actual time=282.487..282.487 rows=0 loops=1)
  Filter: my_col
  Rows Removed by Filter: 366442
When running SELECT pg_total_relation_size('my_table'::regclass);
I found that staging's size was almost double production's. From what I've read, PostgreSQL's MVCC is responsible for this, as it keeps multiple versions of rows around. I manually ran VACUUM FULL
and afterwards saw that staging's size had been cut down by two thirds. Running that same EXPLAIN ANALYZE now shows:
Seq Scan on my_table  (cost=0.00..56094.75 rows=1 width=1436) (actual time=1987.340..1987.340 rows=0 loops=1)
  Filter: my_col
  Rows Removed by Filter: 360287
Total runtime: 1987.547 ms
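For what it's worth, the sizes were easier to compare when wrapped in pg_size_pretty:

```sql
-- pg_total_relation_size counts the table plus its indexes and TOAST data;
-- pg_size_pretty renders the byte count in human-readable units.
SELECT pg_size_pretty(pg_total_relation_size('my_table'::regclass));
```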
Which is great, but what I don't understand is that the documentation suggests autovacuum should kick in and clean up these dead rows, yet clearly that was not happening.
I've read several places warn "don't let your indexes get bloated", and I don't quite understand 1) how an index gets bloated, and 2) how to prevent an index from getting bloated.
How can I prevent this from happening again in the future?
UPDATE
Here are my autovacuum settings (all at their default values):

                name                 |  setting  | unit | short_desc
-------------------------------------+-----------+------+-------------------------------------------------------------------------------------------
 autovacuum                          | on        |      | Starts the autovacuum subprocess.
 autovacuum_analyze_scale_factor     | 0.1       |      | Number of tuple inserts, updates, or deletes prior to analyze as a fraction of reltuples.
 autovacuum_analyze_threshold        | 50        |      | Minimum number of tuple inserts, updates, or deletes prior to analyze.
 autovacuum_freeze_max_age           | 200000000 |      | Age at which to autovacuum a table to prevent transaction ID wraparound.
 autovacuum_max_workers              | 3         |      | Sets the maximum number of simultaneously running autovacuum worker processes.
 autovacuum_multixact_freeze_max_age | 400000000 |      | Multixact age at which to autovacuum a table to prevent multixact wraparound.
 autovacuum_naptime                  | 60        | s    | Time to sleep between autovacuum runs.
 autovacuum_vacuum_cost_delay        | 20        | ms   | Vacuum cost delay in milliseconds, for autovacuum.
 autovacuum_vacuum_cost_limit        | -1        |      | Vacuum cost amount available before napping, for autovacuum.
 autovacuum_vacuum_scale_factor      | 0.2       |      | Number of tuple updates or deletes prior to vacuum as a fraction of reltuples.
 autovacuum_vacuum_threshold         | 50        |      | Minimum number of tuple updates or deletes prior to vacuum.
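For reference, I pulled that list with a query along these lines:

```sql
-- All autovacuum-related server settings, from the pg_settings system view.
SELECT name, setting, unit, short_desc, source
FROM   pg_settings
WHERE  name LIKE 'autovacuum%'
ORDER  BY name;
```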
Best Answer
Autovacuum should eventually get around to cleaning it up (assuming you haven't disabled it), but it may not be getting to it soon enough for your purposes. There are many settings that control autovacuum and how/when it runs, which may be of interest; see the routine-vacuuming and automatic-vacuuming configuration sections of the PostgreSQL documentation.
This can be especially true of tables with high churn, that is, tables with lots of insertions and deletions. Long-running and idle transactions can also be a factor here, since MVCC prevents dead tuples from being reclaimed while some transaction can still see them. The fact that manually running a VACUUM freed the dead tuples suggests that this isn't the case for you, though, and that it may be the former issue instead.

In general, VACUUM FULL is not recommended for routine use, as it takes an exclusive lock on the table while it runs; the documentation reserves it for special cases, such as when most of the rows in a table have been updated or deleted.
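Long-running or idle transactions of the sort mentioned above can be spotted from pg_stat_activity; this is just a sketch, and the exact columns vary slightly between PostgreSQL versions:

```sql
-- Sessions sitting idle inside an open transaction can pin dead tuples
-- and prevent VACUUM from reclaiming them. Oldest transactions first.
SELECT pid, state, xact_start, query
FROM   pg_stat_activity
WHERE  state = 'idle in transaction'
ORDER  BY xact_start;
```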
Is your usage pattern such that this would be the case? You did mention a "direct copy" was involved, but it's not clear exactly how that's being done.
I have had cases with high-churn tables where the default autovacuum rate just wasn't enough, and even relatively small numbers of dead tuples would greatly affect query speed. (This was a large table that was queried very often, where the query needed to be extremely fast and was therefore highly sensitive to dead tuples.)
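When the global defaults aren't aggressive enough for one particular table, the autovacuum thresholds can also be lowered per table via storage parameters. The values below are purely illustrative, using the question's my_table as a stand-in:

```sql
-- Trigger autovacuum after ~1% of rows change instead of the default 20%,
-- and auto-analyze after 0.5% instead of 10%.
ALTER TABLE my_table SET (
    autovacuum_vacuum_scale_factor  = 0.01,
    autovacuum_analyze_scale_factor = 0.005
);
```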
To help with this, I set up a manual VACUUM ANALYZE of the table (so it both frees the dead tuples and aids the query planner by updating the statistics) in a cron job set to run every 5 minutes. Since there were never many dead tuples, the VACUUM was pretty fast, and the constant vacuuming kept the dead-tuple count low enough to keep queries of that table fast.

Edit in response to comment from OP:
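The command that cron job runs might look like this (using the question's my_table as a stand-in for my actual table name):

```sql
-- Unlike VACUUM FULL, this takes no exclusive lock and can run
-- concurrently with normal reads and writes; ANALYZE refreshes the
-- planner statistics at the same time.
VACUUM ANALYZE my_table;
```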
The VACUUM documentation says that VACUUM reclaims storage occupied by dead tuples: in normal PostgreSQL operation, tuples that are deleted or obsoleted by an update are not physically removed from their table until a VACUUM is done. So it definitely reclaims dead tuples.
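To see whether dead tuples are piling up again before they become a problem, the statistics views can be checked with something like this (a sketch against the standard pg_stat_user_tables view):

```sql
-- Per-table dead-tuple counts and last (auto)vacuum times; a high
-- n_dead_tup relative to n_live_tup is a sign of bloat building up.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_vacuum,
       last_autovacuum
FROM   pg_stat_user_tables
ORDER  BY n_dead_tup DESC;
```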