Postgresql – Does daily pg_dump mess up postgres cache

cache, performance, pg-dump, postgis, postgresql

I migrated my geospatial Postgres 12.5 database to another cloud provider. I use PostGIS, and I have around 35GB of data and 8GB of memory.

Performance is much worse than with my previous provider, and the new provider claims this is because the Postgres cache has to be "warmed up" every day after the automatic pg_dump backup operations that run during the night.

Geospatial queries that would normally take 50ms sometimes take 5-10s on first request, and some that would run in 800ms take minutes.

Is there something else going on, or is the technical support right?

If so, should I disable daily backups? Or can I somehow use a utility function to restore the cache (pg_prewarm?)

Best Answer

If you don't want to have backups, then sure, disable them. Of course, then you won't have backups. We cannot tell you if your data is valuable to you or not. But why would you pay someone to host data you don't care about? Maybe you can just turn them off for a day, and see if that actually fixes the problem.

I don't think that pg_dump does anything special that should drive data out of the cache. It does read all of your table data, but the PostgreSQL server has code to prevent a sequential scan of a large table from driving all other data out of cache: bulk reads go through a small ring buffer instead of being allowed to take over all of shared_buffers. This should be just as effective for pg_dump as it is for anything else. But if you have a very large number of small tables, the cumulative effect of reading all of them might be enough to drive everything else out of cache, as I think that eviction-protection code applies to each table individually, not to all of them collectively.
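One way to test this theory is to look at what is actually sitting in shared_buffers before and after the nightly dump. The pg_buffercache extension (shipped in contrib) exposes that; a sketch of the kind of query you could run at both times:

```sql
-- One-time setup (requires superuser or appropriate privileges):
CREATE EXTENSION IF NOT EXISTS pg_buffercache;

-- Top 10 relations by number of buffers currently cached
-- in this database:
SELECT c.relname, count(*) AS buffers
FROM pg_buffercache b
JOIN pg_class c
  ON b.relfilenode = pg_relation_filenode(c.oid)
 AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                           WHERE datname = current_database()))
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 10;
```

If your hot geospatial tables and their GiST indexes show up before the backup but not after, that would support the provider's explanation.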

pg_prewarm is certainly an option. Another would be a cron job that just runs the query in question every morning.
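A minimal sketch of the pg_prewarm approach (the table and index names here are placeholders; substitute your own):

```sql
-- One-time setup:
CREATE EXTENSION IF NOT EXISTS pg_prewarm;

-- Load a table and its spatial index back into shared_buffers.
-- "places" and "places_geom_idx" are hypothetical names:
SELECT pg_prewarm('places');
SELECT pg_prewarm('places_geom_idx');
```

You could schedule these calls from cron shortly after the backup window ends, e.g. something like `30 5 * * * psql -d mydb -c "SELECT pg_prewarm('places')"`. Note that prewarming the indexes matters as much as the tables for geospatial queries, since the first slow request is often paying to fault the GiST index in from disk.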