You will gain much more in terms of performance if you move the table to a different hard disk. As long as the "busy" table and the rest are located on the same disk, moving that table into a different "file" (by moving it into a different database) won't change anything in terms of I/O performance.
Distributing the I/O load to a different hard disk (and a different hard disk controller) will most probably give you better performance for the remaining data, because it is not affected by the I/O done on the busy table.
To move the table to a different hard disk, you need to create a new tablespace (which of course is located on that disk) and then move the table to that tablespace.
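For example (the tablespace name, path and table name below are placeholders, adapt them to your setup; note that ALTER TABLE ... SET TABLESPACE rewrites the table and holds an exclusive lock on it while it runs):

CREATE TABLESPACE fastspace LOCATION '/mnt/disk2/pgdata';  -- directory on the other disk, owned by the postgres OS user
ALTER TABLE busy_table SET TABLESPACE fastspace;           -- moves (rewrites) the table there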
You can solve this for any isolation level by taking an exclusive lock on a row in a parent
table - which is much cheaper and less intrusive than locking the whole table like you do currently.
If you don't have one already (which is a typical case), just create a simple table with some kind of UNIQUE
constraint, a PK serves nicely:
CREATE TABLE parent (val integer PRIMARY KEY); -- match data type
Achieve the lock with an UPSERT statement (as the first statement in the transaction!) which either inserts a row in the parent
table (holding an exclusive lock automatically) or takes a lock on an existing row. This way, only a single transaction at a time can write to rows with the same NEW.val
in the child
table. And all such rows are visible to the current transaction (even in REPEATABLE READ
) because it started with the lock.
You do not even need a foreign key constraint between child
and parent
(though it won't hurt). You can just start with an empty parent
table.
Warning: not reliable as a trigger in REPEATABLE READ
mode! See below.
CREATE OR REPLACE FUNCTION verify()
RETURNS trigger AS
$func$
BEGIN
INSERT INTO parent AS p (val) -- insert parent
VALUES (NEW.val)
ON CONFLICT (val) DO UPDATE -- if already there ...
SET val = NEW.val
WHERE FALSE; -- never executed, just locks the row
IF (SELECT count(*) >= 5 FROM child c WHERE c.val = NEW.val) THEN
RAISE EXCEPTION '% already present 5 times', NEW.val USING ERRCODE = 'integrity_constraint_violation';
ELSE
RETURN NEW; -- proceed with INSERT / UPDATE
END IF;
END
$func$ LANGUAGE plpgsql;
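To take effect, the function still has to be attached to the child table with a trigger, roughly like this (the trigger name is just an example):

CREATE TRIGGER child_verify
BEFORE INSERT OR UPDATE OF val ON child
FOR EACH ROW EXECUTE PROCEDURE verify();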
The manual:
all rows will be locked when the ON CONFLICT DO UPDATE
action is taken.
When inserting / updating multiple rows, make sure the rows are in consistent order to avoid deadlocks.
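For instance, a multi-row insert could enforce a consistent lock order like this (staging is just a stand-in for whatever your row source is):

INSERT INTO child (val)
SELECT val
FROM   staging
ORDER  BY val;  -- consistent order, so parent rows are locked in the same order everywhere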
On second thought, while cheaper than your current trigger function, this can still fail because a trigger function is not necessarily the first statement in the transaction. Concurrent transactions may have committed between transaction start and the time the lock is taken. And any such rows in the child
table remain invisible in REPEATABLE READ
mode, making the count fail.
You would have to encapsulate each write on one or more rows with the same val
in a separate transaction, starting with the UPSERT to take the lock first.
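A minimal sketch of such a transaction (42 stands in for the value at hand):

BEGIN;
INSERT INTO parent AS p (val)        -- take the lock first
VALUES (42)
ON     CONFLICT (val) DO UPDATE
SET    val = 42
WHERE  FALSE;                        -- never executed, just locks the row

INSERT INTO child (val) VALUES (42); -- now safe to write rows for this val
COMMIT;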
Alternative: add a number column and a UNIQUE
constraint
Untested, no more time ...
ALTER TABLE child
ADD COLUMN val_nr int NOT NULL
, ADD CONSTRAINT child_max5_per_val CHECK (val_nr BETWEEN 1 AND 5)
, ADD CONSTRAINT child_val_nr UNIQUE (val, val_nr);
These constraints enforce your rules in all transaction isolation modes. But you need to assign a val_nr
between 1 and 5 for every row. After taking the lock, you can look up existing val_nr
and fill gaps, which can still fail in REPEATABLE READ
mode, but not silently. You'd get a duplicate key violation error.
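One way that lookup could be written (untested sketch; 42 again stands in for the value at hand):

SELECT g.nr                          -- smallest free slot for this val
FROM   generate_series(1, 5) AS g(nr)
LEFT   JOIN child c ON c.val = 42 AND c.val_nr = g.nr
WHERE  c.val_nr IS NULL
ORDER  BY g.nr
LIMIT  1;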
Best Answer
A READ ONLY transaction cannot cause a write transaction (that does not perform DDL) to fail, unless it explicitly uses LOCK TABLE or advisory locks. READ ONLY transactions cannot SELECT ... FOR SHARE or SELECT ... FOR UPDATE. As they can't do DML, the strongest lock they can take on a table is ACCESS SHARE, which conflicts only with the ACCESS EXCLUSIVE lock taken by DDL.
Nor can a read-only transaction cause serialization failures if the write transaction is SERIALIZABLE, because serialization failures require that both transactions perform writes. It is always possible to logically serialize a read-only transaction either before or after a read/write transaction, as it is impossible for them to be mutually interdependent.
So: it should be fine to use READ COMMITTED or SERIALIZABLE transactions, with READ ONLY, so long as you do not explicitly LOCK TABLE.
You also need to make sure you don't use advisory locks that might interact between the two sets of transactions. Most likely you don't use advisory locks at all and can forget about this entirely.
Separately, though, an application must be prepared to deal with serialization failures or other transaction aborts. Any design that tries to avoid this is broken. Transactions can be aborted because of OS / host machine level issues, admin action, etc. Do not rely on transactions that "cannot fail". If you absolutely must do this, you need to use two-phase commit, where you PREPARE TRANSACTION (at which point it's guaranteed that the tx cannot fail to commit), do the other work that relies on the tx committing safely, then COMMIT PREPARED. If something goes wrong with the other work, you can ROLLBACK PREPARED. 2PC has significant overheads and is best avoided when you can, but sometimes there's just no choice.
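A bare-bones sketch of that flow (the identifier 'tx1' is arbitrary; the server must have max_prepared_transactions set greater than zero):

BEGIN;
-- ... the database work ...
PREPARE TRANSACTION 'tx1';  -- from here on the commit is guaranteed to succeed

-- do the external work that depends on the commit, then either
COMMIT PREPARED 'tx1';
-- or, if the external work failed:
-- ROLLBACK PREPARED 'tx1';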