Postgresql – Auto vacuuming high-write, high-update, and mostly read table types

amazon-rdsautovacuummaintenancepostgresql

What are good auto vacuum settings (recommendations) for tables like:

  • High Write Table Insert load

    Range between 30-10,000 inserts in a day. The table can idle for weeks without load, but can get bursts of inserts at least three times a week.

  • High Update Table

    It uses partition table data, is 3-8 times the size of my table from a single insert.

  • High Write Table

    A single row gets updated only once, but bursts of unique key updates in a day
    and needs to be updated it could be 30-10,000 key update.

  • High Read Table

    Most tables are high read tables set to fill factor 80 for my data warehouse, housing table that came from the computation of High Update Table

My deletes happen monthly and in batches. Everything relating to the key gets deleted or moved as backup.

Currently my fill factor is set to be 10-20 for high update table.

Using TDS db.t3.large but I switch to db.t3.micro during low traffic.

Also, does setting fill factor really low slow down selects?

Best Answer

That question is too broad, but here are some pointers about configuring autovacuum:

  • For tables that receive bulk inserts, run an explicit VACUUM afterwards or use PostgreSQL v13.

  • For tables with many updates, a fillfactor of 70 to 90 (depending on row size) is a good thing if no updated columns are indexed. 10 or 20 is just a crazy waste of space.

  • There is no need to care about autovacuum for tables that are only read.

  • With mass deletes, there is nothing much you can do. If you can use partitioning, that pain may vanish completely.

Of course low fillfactor will have a negative impact on query performance; that is the price you are paying for more efficient data modifications:

  • For sequential scans, the impact is clear: you have to read all that empty space too.

  • For index scans that read only a single row, there will be no performance impact.

  • Index scans that read several rows are somewhere in the middle, since they will have to read more blocks if fillfactor is low, because the rows will be spread across more blocks.

  • Don't forget the impact on caching efficiency: If your blocks consist mostly of air, the RAM used for caching will also mostly contain dead space.