Idempotent PostgreSQL Schema Dump for Existing DB

ddlpg-dumppostgresqlpostgresql-9.6replication

I'm looking for a way to replicate a PostgreSQL schema to a database which may already have some of it defined. I wonder if there's a simple way to produce DDL that is idempotent (i.e. that can be applied any number of times) from PostgreSQL.
Currently I obtain DDL from my source DB using pgdump -s, and manipulate it using the ad-hoc Python code below. I wonder if there's a cleaner solution.

import re

def MakeDdlIdempotent(sql):
    "Make PGSQL DDL idempotent by changing various 'create' statements to idempotent versions"
    ddlReplacements = {
        "CREATE SCHEMA":   "CREATE SCHEMA IF NOT EXISTS",
        "CREATE TABLE":    "CREATE TABLE IF NOT EXISTS",
        "CREATE INDEX":    "CREATE INDEX IF NOT EXISTS",
        "CREATE VIEW":     "CREATE OR REPLACE VIEW",
        "CREATE FUNCTION": "CREATE OR REPLACE FUNCTION"
        }

    for orig, replacement in ddlReplacements.items():
        sql = sql.replace(orig, replacement)

    addConstraintRe = re.compile(r"ALTER TABLE [^;]+ ADD CONSTRAINT [^;]+;", flags = re.DOTALL | re.MULTILINE)
    sql = addConstraintRe.sub(
        r"""DO $$
BEGIN
  BEGIN
    \g<0>
    EXCEPTION
        WHEN invalid_table_definition OR duplicate_object OR duplicate_table THEN
    END;
END $$;""",
    sql)

    return sql

Best Answer

I wonder if there's a simple way to produce DDL that is idempotent

There is not, and it doesn't make much sense for a schema with data in it. If one column goes away and another appears, how should such a tool know whether it's an ALTERed column that's had its type changed and been renamed, or a DROP + a ADD? It could be either. The tool would have no way to deduce what the expression used to move the data over, if any, was either.

What you should generally do is use a schema management tool that can create versioned "migration" files for you, like liquibase.