PostgreSQL – Hot Standby Feedback vs. Max Standby Delay Settings for Long Running Analytical Queries

business-intelligenceperformancepostgresqlpostgresql-performancereplication

When spinning up a hot standby server specifically for BI/Analytics purposes where long running queries may be common, is it better to turn on hot_standby_feedback or set the max_standby_*_delay settings to -1?

My understanding is that the hot_standby_feedback prevents the master from doing things like VACUUM until it is safe to do likewise on the standby where the max_standby_*_delay settings allow VACUUM to commence on the master but the standby, if necessary, waits to apply any vacuum cleanup that could conflict with a long running query.

Furthermore, the docs state for hot_standby_feedback:

Remedial possibilities exist if the number of standby-query cancellations is found to be unacceptable. The first option is to set the parameter hot_standby_feedback, which prevents VACUUM from removing recently-dead rows and so cleanup conflicts do not occur. If you do this, you should note that this will delay cleanup of dead rows on the primary, which may result in undesirable table bloat. However, the cleanup situation will be no worse than if the standby queries were running directly on the primary server, and you are still getting the benefit of off-loading execution onto the standby.

And for max_standby_*_delay the docs state:

If the standby server is tasked as an additional server for decision support queries then it might be acceptable to set the maximum delay values to many hours, or even -1 which means wait forever for queries to complete.

It's still unclear to me which is better and what the exact pros and cons of each are?

Best Answer

With hot_standby_feedback on, vacuum can still be done, but it will be less effective as some tuples that would otherwise be vacuumed now have to be delayed to a later vacuum. The only real downside that I know of is the increased bloat. How much of a downside this is depends entirely on your usage. The worst case is if your database has small, intensively updated tables, like a work-queue table. Those could get extremely bloated. If you don't have that kind of table, you probably won't have a problem.

The problems with max_standby_*_delay are that other queries running on the standby are also having their horizon held back by potentially large amounts, and that if you hold back the stand-by long enough accumulated un-replayed WAL files will fill up your hard drive and you will lose the stand-by.