Postgresql – Stored procedure deadlocking itself

deadlockpostgresqlpostgresql-9.2trigger

I have a strange situation, seen from the log:

Process 37278 waits for ExclusiveLock on advisory lock [16421,999999,12864385,2]; blocked by process 53807. 
Process 53807 waits for ExclusiveLock on advisory lock [16421,999999,12864395,2]; blocked by process 37278. 
Process 37278: SELECT * FROM zel_api.insert_event ($1 ,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24) 
Process 53807: SELECT * FROM zel_api.insert_event ($1 ,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24)",
"See server log for query details.",,,"SQL statement ""SELECT pg_advisory_xact_lock(999999, format('zel_event.%I', p_table_name)::regclass::oid::integer)""

This is already strange in itself, as it looks like two processes block the same advisory lock but neither can actually grab it.

The function which tries to acquire the lock is the following:

CREATE OR REPLACE FUNCTION zel_event.create_new_partition(
    p_table_name text
)
RETURNS void AS
$BODY$
DECLARE
    _schema text;
BEGIN
    IF NOT EXISTS (table from catalog)
    THEN
        PERFORM pg_advisory_xact_lock(999999, format('zel_event.%I', p_table_name)::regclass::oid::integer);

        IF NOT EXISTS (table from catalog)
        THEN
            EXECUTE format('
                CREATE TABLE %1$I.%2$I (
                    LIKE zel_event.%2$I INCLUDING ALL
                ) INHERITS (zel_event.%2$I)', _schema, p_table_name);
        END IF;
    END IF;
    RETURN;

END;
$BODY$
LANGUAGE plpgsql
SECURITY DEFINER;

To my eyes none of the 'classic' deadlock causes stands here as there is only one logic…
The idea behind it is to create the new table on demand where the demand can come at a really high frequency from several connections. It is called when an insertion function does the insert aimed at the parent table, being diverted by a dispatcher trigger to the appropriate child table.

This is how the transaction looks like:

INSERT INTO parent (zel_api.insert_event())
        |
     trigger
        |
        +----> child exists? ---no---> CREATE TABLE child
                      |                         |
                     yes                        |
                      |                         |
                      V                         V
              INSERT INTO child (zel_event.create_new_partition())   

The PostgreSQL version is 9.2

It would be very interesting to have some insights how this can happen (and mitigated, of course).

Best Answer

It's not the same advisory lock. Your log mentions a different table oid (off by 10) for the first process as the second. This looks like a classic deadlock situation where two processes are seeking the same locks just in a different order. The third number in the first is 12864385 while the third number in the second is 12864395.

So my suspicion is that both processes are trying to create the same partitions, just in a different order and so they deadlock.

One thing I have found helpful in puzzling situations like this is to go through and compare one digit at a time all the way through the series of numbers because with advisory locks it is way too easy to miss one digit subtly off like this.