Specific issues with the query
PostgreSQL defaults to READ COMMITTED isolation, and it looks like you're not using anything different. In READ COMMITTED each statement gets its own snapshot. It cannot see uncommitted changes from other transactions; it's as if they simply haven't happened.
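If you want to double-check which level you're actually running at, you can ask the server directly:

SHOW default_transaction_isolation;               -- session/server default
SELECT current_setting('transaction_isolation');  -- level of the current transaction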
Now, imagine you run this simultaneously in three sessions, on a setup with three entries in jobs, zero in jobs_running and zero in jobs_completed:
insert into piper.jobs_running
select x.fid from (
SELECT fid FROM piper.jobs
except
select fid from piper.jobs_running
except
select fid from piper.jobs_completed
) as x limit 1
returning(fid)
Each run's subquery will find all three jobs in jobs. Because their snapshots were all taken before any of them committed a change, or even created the uncommitted row, all of them will find zero rows in jobs_running and jobs_completed.
So they all claim a job. Probably the same job, because even though there's no ORDER BY, the scan order will be the same.
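Sketched as a timeline (the fid value is just for illustration):

-- Session 1: takes its snapshot, sees jobs_running and jobs_completed as empty, claims fid 1, commits
-- Session 2: took its snapshot before session 1 committed, also sees them as empty, claims fid 1, commits
-- Session 3: same again - claims fid 1, commits
-- Afterwards jobs_running holds fid 1 three times (or, if fid is unique there,
-- two sessions fail with duplicate-key errors) and the other two jobs are never claimed.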
Locking
Row locking crosses transaction boundaries, and lets you communicate between transactions to enforce ordering. So you might think this would solve your problem, but it won't.
If you take a FOR UPDATE lock on the jobs row, so the row is locked exclusively, the lock is held until the transaction commits or rolls back. So you'd think the next transaction would get a different row, or, if it tried to get the same row, that it would wait for the lock to be released, see that there was now an entry in jobs_running, and skip the row. You'd be wrong.
What'll happen is that you'll lock all rows. One transaction will successfully get the locks on all rows. The other transactions, which will do the same index or sequential scan, will generally try to lock the rows in the same order, get stuck on the first lock attempt, and wait for the first transaction to roll back or commit. If you're unlucky, they might start locking different sets of rows and deadlock against each other instead, causing a deadlock abort, but usually you're just getting no useful concurrency.
Worse, though: the first transaction picks one of the rows it locked, inserts a row into jobs_running and commits, releasing the locks. Another transaction is then able to continue, and locks all the rows... but it doesn't get a new snapshot of the database state (the snapshot was taken at statement start), so it can't see the row just inserted into jobs_running. Thus it grabs the same job, inserts a row for that job into jobs_running, and commits.
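To make that concrete, here's roughly what the locking attempt might look like. This is a sketch, not your query: FOR UPDATE isn't allowed with EXCEPT, so the anti-joins are rewritten with NOT EXISTS, and the claiming insert into jobs_running would follow in the same transaction.

select j.fid
from piper.jobs j
where not exists (select 1 from piper.jobs_running r where r.fid = j.fid)
  and not exists (select 1 from piper.jobs_completed c where c.fid = j.fid)
limit 1
for update;
-- locks the chosen jobs row, but the statement's snapshot is unchanged,
-- so a session that was blocked on this lock still can't see the jobs_running
-- row the winner inserted, and claims the same fid anyway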
Condition re-checks
PostgreSQL has a quirky feature, not included in most databases, where if a transaction blocks on a lock, it re-checks that the selected row still matches the query's WHERE conditions when it finally gets the lock, after the transaction holding it commits.
That's why the example in https://stackoverflow.com/questions/11532550/atomic-update-select-in-postgres works - it relies on the re-check of the WHERE clause qualifiers after the lock is acquired.
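The pattern that answer relies on looks roughly like this (a sketch using a made-up single-table queue with a claimed flag, not your schema). The key is that the condition is repeated in the UPDATE's own WHERE clause, so the re-check after the lock is acquired protects you:

update queue
set claimed = true
where claimed = false
  and id = (select id from queue where claimed = false limit 1)
returning id;
-- if another session claimed that id first, the re-check of "claimed = false"
-- fails once the lock is obtained, this statement updates zero rows instead of
-- double-claiming, and the application simply retries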
The use of locking forces everything to run serially, so in practice you might as well have a single connection doing the work.
Isolation, ACID, and reality
Transaction isolation in PostgreSQL is not the perfect ideal, where transactions run concurrently but their effects are the same as if they had been executed serially.
The only real-world database that could deliver perfect isolation would be one that exclusively locks every table a write transaction touches as soon as it first accesses it, so transactions could only run concurrently if they touched entirely different parts of the database. Nobody wants to use such a database in situations where concurrency is desirable or useful.
All real world implementations are compromises.
READ COMMITTED default
PostgreSQL's default is READ COMMITTED isolation, a well-defined isolation level that permits non-repeatable and phantom reads, as discussed in the PostgreSQL manual on transaction isolation.
SERIALIZABLE isolation
You can request stricter SERIALIZABLE isolation on a per-transaction basis, or as a per-user, per-database, or (not recommended) global default. This provides much stronger guarantees, though still not perfect, at the cost of forcibly rolling back transactions if they interact in ways that couldn't happen had they run serially.
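For example (the database and role names here are placeholders):

BEGIN ISOLATION LEVEL SERIALIZABLE;   -- per transaction
-- ... do the work ...
COMMIT;

ALTER DATABASE mydb SET default_transaction_isolation = 'serializable';      -- per database
ALTER ROLE queue_worker SET default_transaction_isolation = 'serializable';  -- per user
-- the global default would be default_transaction_isolation = 'serializable'
-- in postgresql.conf, which is the option that isn't recommended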
Because your concurrent queries will always try to grab the first job, no matter how many jobs there are, all but one will abort with serialization failures. So in practice you don't get any useful concurrency and might as well have a single connection handing out jobs to workers.
(Note that prior to PostgreSQL 9.1, SERIALIZABLE offered much weaker guarantees and would not detect quite a lot of cases where transactions were interdependent.)
Auto-re-run SERIALIZABLE
PostgreSQL does not automatically re-run SERIALIZABLE transactions that abort with serialization failures. There are cases where this would be quite useful, but other cases where doing so would be completely wrong and dangerous - especially where read/modify/write cycles via the application are involved. The application is expected to retry.
Don't write a DIY queuing system
It looks like what you're trying to do is write a queuing system. Consider not doing that. Writing a robust, reliable and correct queueing system is hard, and there are a few good ones already available that you could adopt. You have to handle things like crash-safety, what happens when someone takes a task then fails to complete it, race conditions around completion just as you give up on the task handler completing it, etc. There are lots of subtle concurrency problems. Don't try to DIY.
9.5 and SKIP LOCKED
PostgreSQL 9.5, still in development, adds a feature that makes queuing quite a bit easier.
It lets you say that when you SELECT ... FOR UPDATE, any row that's already locked should be skipped so the scan carries on to the next unlocked row. This is very useful when combined with a LIMIT, as it makes it possible to say "find me the first row someone else isn't already trying to claim". That makes it very simple to write queuing code that grabs jobs safely and concurrently.
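A claim query using it could look something like this (again a sketch; since FOR UPDATE isn't allowed with EXCEPT, the anti-joins are written with NOT EXISTS):

select j.fid
from piper.jobs j
where not exists (select 1 from piper.jobs_running r where r.fid = j.fid)
  and not exists (select 1 from piper.jobs_completed c where c.fid = j.fid)
limit 1
for update skip locked;
-- rows another session has already locked are skipped instead of waited on,
-- so each worker gets a different job; insert into jobs_running and commit
-- while still holding the lock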
Until that feature is available I strongly suggest that you stick to single connection queue management, or use a well-tested task queue system.
Best Answer
Serializable isolation says that the result of concurrent transactions has to be consistent with at least one serial order; it doesn't matter which one.
Here there are two possible serial (sequential) orders, T1 followed by T2 and T2 followed by T1.
The result of T1 followed by T2 happens to be exactly the result you got in the concurrent execution of T1 and T2.
This is why there is no reason for the engine to raise a serialization exception. This particular concurrent execution meets the condition of success of the serializable isolation level.
Here you are confused about how it works. The isolation level does not influence the order of execution. It merely detects serialization anomalies and raises an error when they happen (and in fact sometimes when they merely might happen, because false positives are possible).