Postgresql – postgres partition trigger and checking for child table

partitioningpostgresqltrigger

I found some sample code for creating a partition table.

Basically, setup the master table and partition on date field concatenated with the type of entity it is then, inherit from the master.

The first call of the trigger is fine, but subsequent ones fail because the 'if' condition to check for an existing child table yields false but upon trying to create the table fails for an already existing table.

I have tried 3 ways to check for a child tables existence and none yield correct answers.

These 2 statement demonstrate the contradiction:

SELECT count(*) FROM pg_class WHERE relname='DRIVER_2016_12_28';
count(*)
0

select count(*) from DRIVER_2016_12_28;
count(*)
1

Master table:

CREATE TABLE partition_test
(
  pk SERIAL PRIMARY KEY NOT NULL,
  id VARCHAR(36) NOT NULL,
  organizationid VARCHAR(36) NOT NULL,
  lat REAL NOT NULL,
  lon REAL NOT NULL,
  basetype VARCHAR(16) NOT NULL,
  name VARCHAR(64) NOT NULL,
  updatetimestamp TIMESTAMP NOT NULL
) WITHOUT OIDS;

trigger code:

CREATE OR REPLACE FUNCTION telemetry_insert_trigger() RETURNS trigger AS
$BODY$
DECLARE
  partition_date TEXT;
  partition TEXT;
BEGIN
  partition_date := to_char(NEW.updatetimestamp,'YYYY_MM_DD');
  partition := NEW.baseType || '_' || partition_date;

  IF NOT EXISTS(SELECT relname FROM pg_class WHERE relname=partition) THEN
    EXECUTE 'CREATE TABLE ' || partition || ' (check (updatetimestamp = ''' || NEW.updatetimestamp || ''')) INHERITS (' || TG_RELNAME || ');';
    EXECUTE 'CREATE INDEX ON ' || partition || ' USING HASH (id)';
    EXECUTE 'CREATE INDEX ON ' || partition || ' USING HASH (organizationid)';
    EXECUTE 'CREATE INDEX ON ' || partition || ' (lat)';
    EXECUTE 'CREATE INDEX ON ' || partition || ' (lon)';
    EXECUTE 'CREATE INDEX ON ' || partition || ' (updatetimestamp)';
    RAISE NOTICE 'A partition has been created %',partition;
  END IF;
  EXECUTE 'INSERT INTO ' || partition || ' SELECT(' || TG_RELNAME || ' ' || quote_literal(NEW) || ').* RETURNING pk;';
  RETURN NULL;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;

CREATE TRIGGER telemetry_partition_insert_trigger
BEFORE INSERT ON partition_test
FOR EACH ROW EXECUTE PROCEDURE telemetry_insert_trigger();

insert example:

INSERT into partition_test (id, organizationid, lat, lon, basetype, name, updatetimestamp) VALUES ('123','456',12.0,12.0,'DRIVER','SomeName',now());

first output

[2016-12-28 12:15:12] [00000] A partition has been created
DRIVER_2016_12_28 [2016-12-28 12:15:12] completed in 33ms

second:

INSERT into partition_test (id, organizationid, lat, lon, basetype, name, updatetimestamp) VALUES ('123','456',12.0,12.0,'DRIVER','SomeName',now());

[2016-12-28 12:16:15] [42P07] ERROR: relation "driver_2016_12_28"
already exists [2016-12-28 12:16:15] Where: SQL statement "CREATE
TABLE DRIVER_2016_12_28 (check (updatetimestamp = '2016-12-28
12:16:15.467012')) INHERITS (partition_test);" [2016-12-28 12:16:15]
PL/pgSQL function telemetry_insert_trigger() line 11 at EXECUTE
statement

The question: How can I effectively check for the existing of the child table so that I don't get this error? If you'd like to point out existing problems with my trigger, that'd be great too

Best Answer

I would advise to use always quote_ident when generating dynamic sentences, to make sure your idents don't have upper/lowercase problems, or that you can use identifiers with spaces or special chars, and that you avoid SQL injection if your idents are parameters out of your control.

That is, your trigger function should look like:

CREATE OR REPLACE FUNCTION telemetry_insert_trigger() RETURNS trigger AS
$BODY$
DECLARE
  partition_date TEXT;
  partition TEXT;
  start_ts TIMESTAMP ;
  end_ts TIMESTAMP ;
BEGIN
  partition_date := to_char(NEW.updatetimestamp,'YYYY_MM_DD');
  partition := NEW.baseType || '_' || partition_date;
  start_ts := partition_date::timestamp ;
  end_ts := start_ts + interval '1 day' ;

  IF NOT EXISTS(SELECT relname FROM pg_class WHERE relname=partition) THEN
    EXECUTE 'CREATE TABLE '    || quote_ident(partition) 
        || ' (check (updatetimestamp >= ''' || start_ts || ''' ' 
        || '     AND updatetimestamp <  ''' || end_ts || ''')) ' 
        || ' INHERITS (' || TG_RELNAME || ');';
    EXECUTE 'CREATE INDEX ON ' || quote_ident(partition) || ' USING HASH (id)';
    EXECUTE 'CREATE INDEX ON ' || quote_ident(partition) || ' USING HASH (organizationid)';
    EXECUTE 'CREATE INDEX ON ' || quote_ident(partition) || ' (lat)';
    EXECUTE 'CREATE INDEX ON ' || quote_ident(partition) || ' (lon)';
    EXECUTE 'CREATE INDEX ON ' || quote_ident(partition) || ' (updatetimestamp)';
    RAISE NOTICE 'A partition has been created %', partition;
  END IF;
  EXECUTE 'INSERT INTO ' || quote_ident(partition) || ' SELECT(' || TG_RELNAME || ' ' || quote_literal(NEW) || ').* RETURNING pk;';
  RETURN NULL;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;

Check a usage example for quote_literal on the PostgreSQL documentation.

As a matter of preference, I prefer to have the names of all tables lowercase and with no special chars (so that I don't have "Table Names with Double Quotes", but table_names_without_quotes), because I think the code is less noisy and more readable. But, again, this is a matter of taste.