PostgreSQL Update and Delete are not (logically) replicated while having REPLICA IDENTITY

postgresql replication

I was trying to sync a table with logical replication. This needs to cover Insert, Update and Delete. However, replication only worked for Insert, not for Update or Delete.

As the documentation states:

A published table must have a “replica identity” configured in order
to be able to replicate UPDATE and DELETE operations, so that
appropriate rows to update or delete can be identified on the
subscriber side.

Here's the schema of the publishing table

CREATE TABLE mytest.point_a (
    gid       serial NOT NULL PRIMARY KEY,
    the_value numeric(10,2),
    name      TEXT NOT NULL,
    geom      geometry(Point,4326)
);

CREATE INDEX test_point_a_geom_idx
    ON mytest.point_a USING gist (geom);

ALTER TABLE mytest.point_a REPLICA IDENTITY DEFAULT;

The table I am trying to replicate has the REPLICA IDENTITY clause and a serial integer column as its primary key.
The corresponding column in the subscribing table is a plain bigint without a sequence.

Here's the subscribing table:

CREATE TABLE mytest.point_a (
    gid bigint,
    -- column order is different; this is on purpose
    name TEXT,
    address TEXT,   -- does not exist in the publishing table
    geom geometry,  -- more flexible than the one in the publishing table
    the_value numeric(10,2)
);
-- this table has no index

All inserts are replicated but Updates and Deletes are not.

So what's missing here? Did I add REPLICA IDENTITY incorrectly?

Another relevant question: how do we know whether a (publishing) table has a REPLICA IDENTITY clause? In pgAdmin III and DBeaver, the DDL view of the publishing table above does not show any hint of it.

Best Answer

Add a primary key constraint on gid on the replica so that rows can be identified.
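
On the subscriber, this could look like the statement below. It is a minimal sketch that assumes the subscribing table from the question and that gid currently holds no NULLs or duplicates there; otherwise the statement fails and the data has to be cleaned up first.

```sql
-- Run on the subscriber.
-- Adding the primary key also marks gid as NOT NULL and creates
-- the unique index the apply process uses to locate rows.
ALTER TABLE mytest.point_a
    ADD PRIMARY KEY (gid);
```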

The default value for REPLICA IDENTITY of a table is documented with ALTER TABLE:

This option has no effect except when logical replication is in use. DEFAULT (the default for non-system tables) records the old values of the columns of the primary key, if any. USING INDEX records the old values of the columns covered by the named index, which must be unique, not partial, not deferrable, and include only columns marked NOT NULL. FULL records the old values of all columns in the row. NOTHING records no information about the old row.

That is, the default setting logs enough data under the assumption that the replica has the same primary key as the source table.
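
To check which setting a table currently has, the system catalog can be queried directly. The sketch below reads pg_class.relreplident for the table from the question; the column aliases are only illustrative.

```sql
-- relreplident holds one character per setting:
-- d = DEFAULT, n = NOTHING, f = FULL, i = USING INDEX
SELECT c.oid::regclass AS table_name,
       CASE c.relreplident
            WHEN 'd' THEN 'DEFAULT'
            WHEN 'n' THEN 'NOTHING'
            WHEN 'f' THEN 'FULL'
            WHEN 'i' THEN 'USING INDEX'
       END AS replica_identity
FROM   pg_class c
WHERE  c.oid = 'mytest.point_a'::regclass;
```

Because DEFAULT is the implicit setting, DDL views in GUI tools may simply omit it.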

If you don't have a primary key or unique index on the table, you could use REPLICA IDENTITY FULL so that a row is identified by all its columns, but duplicate rows will still cause trouble (deleting one of them on the publisher will delete all on the subscriber). I don't think that is a smart thing to do; all tables should have a primary key.
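
For completeness, switching the setting is a single statement on the publishing table. This is only a sketch of the fallback described above, not a recommendation; the table name is the one from the question.

```sql
-- Run on the publisher: log the old values of all columns
-- for every UPDATE and DELETE.
ALTER TABLE mytest.point_a REPLICA IDENTITY FULL;

-- Switch back once a primary key exists on both sides.
ALTER TABLE mytest.point_a REPLICA IDENTITY DEFAULT;
```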

Related Solutions

Postgresql – Postgres multiple joins slow query, how to store default child record

You write:

Each customer can have multiple sites, but only one should be displayed in this list.

Yet, your query retrieves all rows. That would be a point to optimize. But you also do not define which site is to be picked.

Either way, it does not matter much here. Your EXPLAIN shows only 5026 rows for the site scan (5018 for the customer scan). So hardly any customer actually has more than one site. Did you ANALYZE your tables before running EXPLAIN?

From the numbers I see in your EXPLAIN, indexes will give you nothing for this query. Sequential table scans will be the fastest possible way. Half a second is rather slow for 5000 rows, though. Maybe your database needs some general performance tuning?

Maybe the query itself is faster, but "half a second" includes network transfer? EXPLAIN ANALYZE would tell us more.

If this query is your bottleneck, I would suggest you implement a materialized view.

After you provided more information I find that my diagnosis pretty much holds.

The query itself needs 27 ms. Not much of a problem there. "Half a second" was the kind of misunderstanding I had suspected. The slow part is the network transfer (plus ssh encoding / decoding, possibly rendering). You should only retrieve 100 rows at a time; that would solve most of it, even if it means executing the whole query every time.

If you go the materialized view route as I proposed, you could add a gapless serial number to the table, plus an index on it, by adding a column row_number() OVER (<your sort criteria here>) AS mv_id.

Then you can query:

SELECT *
FROM   materialized_view
WHERE  mv_id >= 2700
AND    mv_id <  2800;

This will perform very fast. LIMIT / OFFSET cannot compete: it needs to compute the whole table before it can sort and pick 100 rows.
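
A rough sketch of how such a materialized view might be built with CREATE MATERIALIZED VIEW (PostgreSQL 9.3 or later). The table names customer and site, their columns, and the sort order are assumptions made for illustration; the original query is not shown here.

```sql
-- Hypothetical source tables: customer(customer_id, name), site(site_id, customer_id)
CREATE MATERIALIZED VIEW materialized_view AS
SELECT row_number() OVER (ORDER BY c.name, c.customer_id) AS mv_id,  -- gapless position
       c.customer_id,
       c.name,
       s.site_id
FROM   customer c
LEFT   JOIN site s ON s.customer_id = c.customer_id;

-- The index that makes range scans on mv_id cheap.
CREATE UNIQUE INDEX materialized_view_mv_id_idx
    ON materialized_view (mv_id);

-- Re-run whenever the underlying data has changed.
REFRESH MATERIALIZED VIEW materialized_view;
```

The numbering is only gapless as of the last refresh, so pages can be stale until the view is refreshed again.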

pgAdmin timing

When you execute a query from the query tool, the message pane shows something like:

Total query runtime: 62 ms.

And the status line shows the same time. I quote pgAdmin help about that:

The status line will show how long the last query took to complete. If a dataset was returned, not only the elapsed time for server execution is displayed, but also the time to retrieve the data from the server to the Data Output page.

If you want to see the time on the server, you need to use SQL EXPLAIN ANALYZE, the built-in Shift + F7 keyboard shortcut, or Query -> Explain analyze. Then, at the bottom of the explain output, you get something like this:

Total runtime: 0.269 ms
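
As a small illustration, the statement below can be run in any database because it only reads a system catalog. The query itself is arbitrary; any SELECT can take its place.

```sql
-- Executes the query on the server and reports the measured time
-- in the last lines of the plan output; no result rows are sent
-- to the client.
EXPLAIN ANALYZE
SELECT count(*)
FROM   pg_class;
```

Newer PostgreSQL versions label the final line as execution time rather than total runtime and report planning time separately.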

Sql-server – Replicating Triggers

I believe you want to mark the trigger NOT FOR REPLICATION. This will prevent the trigger from firing when the Merge Agent makes the update. Have a look at All about "Not for Replication".
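
A minimal T-SQL sketch of such a trigger. The table dbo.Orders, the audit table dbo.OrdersAudit, their columns, and the trigger name are hypothetical and only serve to show where the NOT FOR REPLICATION clause goes.

```sql
-- Hypothetical tables: dbo.Orders(OrderId, ...), dbo.OrdersAudit(OrderId, ChangedAt)
CREATE TRIGGER dbo.trg_Orders_Audit
ON dbo.Orders
AFTER UPDATE
NOT FOR REPLICATION   -- skip the trigger when a replication agent applies the change
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO dbo.OrdersAudit (OrderId, ChangedAt)
    SELECT i.OrderId, SYSUTCDATETIME()
    FROM   inserted AS i;
END;
```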
