Postgresql – Is Postgres insert performance the same in the replica

indexperformancepostgresqlreplication

We've got a Postgres database with a table that takes heavy select, update and inserts that needs some tuning (it has too many indexes).

We have a read replica, and I'm wondering if moving all of our SELECT queries over to the replica will be a quick win while we sort out the indexes performance. (Our application can be switched over to a replica fairly easily while sorting the indexes is going to take longer).

My understanding is that pointing read operations at the replica probably won't improve performance in this case, but wanted to check my understanding.

My understanding is that:

  1. INSERT and UPDATE operations will return once they're committed to memory (which is then written to the WAL asynchronously), so they'll be not much faster.
  2. The replica will stream the WAL and will have to update it's own copy of the tables and indexes. So SELECT operations on the replica will have to contend with the same IO and CPU demands that those updates impose on the primary, so will gain no performance.

If our app were simply streaming inserts, then I could see a performance improvement for the inserts, but in practice since the clients are performing SELECT/INSERT together, I can't see a real performance gain here in this use case?

Is this correct, or have I missed something?

Best Answer

My understanding is that:

INSERT and UPDATE operations will return once they're committed to memory (which is then written to the WAL asynchronously), so they'll be not much faster.

This sounds like a description of sychronous_commit=off, but that is not the default value for that setting, and you don't describe having changed it.

In any event, modifications can involve calculations and the consultation of any number of rows other than the ones being modified. Having to share CPU and disks with concurrent heavy selects could certainly mutually slow the modifications down. If all your updates are just by pk and setting columns to a literal, and all your inserts are just of a list of literals, and you don't have fk constraints, maybe that is not a concern.

The replica will stream the WAL and will have to update it's own copy of the tables and indexes. So SELECT operations on the replica will have to contend with the same IO and CPU demands that those updates impose on the primary, so will gain no performance.

Replaying the WAL might be less IO than generating it was. The replica doesn't have to verify fk constraints. If the update was HOT (heap only tuple) then the replica doesn't need to consult any indexes, while the original work probably had to consult at least one index in order to identify the tuple to update. And the replica doesn't need to do anything with tuples that the primary considered but then filtered out before modifying.

Accurately predicting how much benefit you might get by redirecting reads is probably at least as hard as just optimizing your indexes will be.