Postgresql – After bulk load, how to ensure an updated visibility map

postgresqlpostgresql-11

We have a number of tables that have an active version, an inactive version, and a view pointing at the active one. We periodically truncate the inactive table, drop the indexes, load a bunch of data into the table, re-build the indexes and redefine the view to point to the newly loaded table. This is reasonably standard data warehouse stuff.

The issue we're running into is that we are unable to use index-only scans on the freshly-loaded tables after our script runs due to the visibility map not being updated. I tried adding a VACUUM step to our script, using pg_background. I've tried both VACUUM ANALYZE and VACUUM FREEZE ANALYZE, but neither will update the visibility map if there's a transaction that started before the VACUUM.

The only workaround I have right now is to fail out of the script if pg_class.relallvisible = 0 for the backing table after the VACUUM FREEZE, but this is not ideal. Are there any other ways to force the visibility map to update?

To be clear, there will be no user activity on the inactive table until the script updates the view definition, so I'm free to do pretty much anything that I want to it, but performance is somewhat of an issue (e.g, I would not want to VACUUM FULL) and my scripts can't hold any locks that would prevent selects on other tables.

Best Answer

You can set old_snapshot_threshold and then wait out that amount of time before retrying the VACUUM.

(Note that this is a last resort. First resort would be just to identify the long running connections, and making them go away if they are not really needed, or downgrading their isolation level).