PostgreSQL – Avoiding Deadlocks with Multiple Clients

I am developing software which launches multiple, concurrent clients to connect to a PostgreSQL (12) database. When each client starts, the first thing it does upon connecting to PostgreSQL is to run the schema creation script.

This script has been written idempotently — at least in principle — such that the multiple clients shouldn't trip over themselves. By in large, this works fine. However, PostgreSQL sometimes detects deadlocks and affected client(s) crash. Looking through the logging, I believe these occur under such a sequence:

Client A: Begin schema creation transaction
Client A: Finish schema creation transaction
Client B: Begin schema creation transaction
Client A: New transaction that uses schema (select from view)
- Client A and B now in deadlock

The logs aren't 100% clear and I can't reproduce this deterministically, but that seems to be what's going on: Client A is trying to SELECT from a view defined by the schema, but it's deadlocking because Client B is trying to recreate that view (CREATE OR REPLACE VIEW) in the schema script.

Is it possible to ensure that the schema creation script runs exclusively? Or, is there some other solution (e.g., rather than CREATE OR REPLACE VIEW, I only CREATE VIEW once I've determined it doesn't already exist)?

Best Answer

From the comments to my question, the problem can be summarised, per @DanielVérité, as "concurrent DDL is bad; concurrent DDL and DML is really bad". I was not aware of that -- even with the care I took to make the DDL idempotent -- but it's a good thing to know. So, that leaves me with two broad solutions:

Don't do the DDL concurrently (e.g., have a single client instantiate the schema).

This is probably the "easiest" solution from an application development perspective -- and I see it as my backup plan -- but it has consequences on how the application is architectured. I only want to take this route if I have to.
Use advisory locks, on @a_horse_with_no_name's suggestion.

This requires a minor modification to my application's database library only, so no large architectural changes or changes to the schema. Its downside is that I have to acquire and release the lock on every transaction (to make sure it isn't already locked by the DDL section of the code), which necessarily serialises all transactions across clients.

I am experimenting with option 2 and it appears to be working. I need to give it a few more runs through to convince myself that the deadlocking has gone and what performance impact the serialisation has (anecdotally, so far, it's negligible). I'll report back with results when I have them...

Best Answer

Related Solutions

Postgresql – Rotate a table in PostgreSQL

PostgreSQL 9.3 – Logging Questions

Related Question