Postgresql – Add column to database, used with logical replication generates large WAL files

postgresqlreplication

I have a PostgreSQL 10 master database with a logical replication, but I faced with an issue.
I added one more column (int, without default value) to a big table in master (weight 39gb, over 100 million entries) within a transaction, and updated its value with a same value in another column.

begin;
alter table test add column "onecolumn" int;
update table test set "onecolumn"="secondcolumn"
commit;

Finally it generated large WAL files with weight 39gb, so Postgres replicated whole table instead of only a column.

Why Postgres generated so big WAL files? Because real weight of adding one column with int should be much less.

Replica identity of the table is set to default

Best Answer

When you update a row, a new copy of the complete row is written to the table (and to WAL). This is how PostgreSQL works.

So after the update, the table will be more than twice as big, and you have generated a lot of WAL.

One thing you can do is run the update in batches of 10 million rows and run VACUUM on the table between these batches. That won't reduce WAL (that is not possible), but it will prevent table bloat. But you cannot do that in a single transaction.

In short, everything is working as it should.