PostgreSQL Update and Delete are not (logically) replicated while having REPLICA IDENTITY

postgresqlreplication

I was trying to sync a table with Logical Replication. This needs to cover Insert, Update and Delete. However the replication only worked for Insert and not the rest.

As documented here

A published table must have a “replica identity” configured in order
to be able to replicate UPDATE and DELETE operations, so that
appropriate rows to update or delete can be identified on the
subscriber side.

Here's the schema of the publishing table

CREATE TABLE mytest.point_a (
gid serial NOT NULL PRIMARY KEY,   
the_value numeric(10,2),   
name TEXT NOT NULL,   
geom geometry(Point,4326) ); 

CREATE INDEX test_point_a_geom_idx 
ON mytest.point_a USING gist (geom); 
ALTER TABLE mytest.point_a REPLICA IDENTITY DEFAULT;  

The above table I was trying to replicate has the REPLICA IDENTITY clause and a serial integer column as Primary Key.
The corresponding column in the subscribing table is just a plain bigint without Sequence.

Here's the subscribing table:

CREATE TABLE mytest.point_a (   
gid bigint,
--column order is different, this is on purpose       
name TEXT,   
address TEXT, --not exists in publishing 
geom geometry, --more flexible than the one in publishing
the_value(10,2) 
);
--this one does not have index

All inserts are replicated but Updates and Deletes are not.

So what's missing here? Did I add REPLICA IDENTITY incorrectly?

Another relevant question: How do we know if a (publishing) table has REPLICA IDENTITY clause? In pgAdmin III and DBeaver, the DDL view of my publishing table in the above do not show any hint about it.

Best Answer

Add a primary key constraint on gid on the replica so that rows can be identified.

The default value for REPLICA IDENTITY of a table is documented with ALTER TABLE:

This option has no effect except when logical replication is in use. DEFAULT (the default for non-system tables) records the old values of the columns of the primary key, if any. USING INDEX records the old values of the columns covered by the named index, which must be unique, not partial, not deferrable, and include only columns marked NOT NULL. FULL records the old values of all columns in the row. NOTHING records no information about the old row.

That is, the default setting logs enough data under the assumption that the replica has the same primary key as the source table.

If you don't have a primary key or unique index on the table, you could use REPLICA IDENTITY FULL so that a row is identified by all its columns, but if you have duplicate rows, you will still get in trouble (deleting one of them on the primary will delete all on the standby). I don't think that is a smart thing to do, and all tables should have a primary key.