I am using a Postgres DB to implement a scheduling of jobs for a large number of computers/processes. To make the story short, every job has its id, all scheduling is implemented with three tabes: all jobs, currently running jobs, and already completed jobs.
The key functionality of the scheduling is to (1) request a job and (2) inform DB about a completed job. Requesting a job takes any id from the job list, which is not in the running table and not in the completed table:
insert into piper.jobs_running
select x.fid from (
SELECT fid FROM piper.jobs
except
select fid from piper.jobs_running
except
select fid from piper.jobs_completed
) as x limit 1
returning(fid)
Completing the job removes it from the running list and inserts it into the completed list. Since it is not concurrency-specific I ommit the SQL commands (it takes something from tens of minutes to a couple of hours to finish a job.)
It came as a nasty surpirse to me, that two processes runing exactly the same query above (pretty much requesting the job at the same time) may get the same job id (fid). The only possible explanation that I am comming up is that Postgres is not rely ACID conform. Comments?
Additional information: I set the transactions to be serializable (in postgresql.conf set default_transaction_isolation = 'serializable'
). Now the DBMS fails the transactions in case the isolation is not fullfilled. is it possible to force Postgres to restart them automatically?
Best Answer
Specific issues with the query
PostgreSQL defaults to
READ COMMITTED
isolation, and it looks like you're not using anything different. InREAD COMMITTED
each statement gets its own snapshot. It cannot see uncommitted changes from other transactions; it's as if they simply haven't happened.Now, imagine you run this simultaneously in three sessions, on a setup with three entries in
jobs
, zero injobs_running
and zero injobs_completed
:Each run will select the three jobs in
jobs
. Because their snapshots were all taken before any of them committed a change, or even created the uncommitted row, *all of them will find zero rows injobs_running
andjobs_completed
.So they all claim a job. Probably the same job, because even though there's no
ORDER BY
, the scan order will be the same.Locking
Row locking crosses transnational boundaries, and lets you communicate between transactions to enforce ordering. So you might think this would solve your problem, but it won't.
If you take a
FOR UPDATE
lock on therow
entry, so the row is locked exclusively, the lock is held until the transaction commits or rolls back. So you'd think that then the next transaction would get a different row, or if it tried to get the same row it'd wait for the lock to release, see that there was now an entry injobs_running
, and skip the row. You'd be wrong.What'll happen is that you'll lock all rows. One transaction will successfully get the locks on all rows. The other transactions, which will do the same index or sequential scan, will generally try to lock the rows in the same order, get stuck on the first lock attempt, and wait for the first transaction to roll back or commit. If you're unlucky, they might start locking different sets of rows and deadlock against each other instead, causing a deadlock abort, but usually you're just getting no useful concurrency.
Worse though, the first transaction picks one of the rows it locked, inserts a row into
jobs_running
and commits, releasing the locks. Another transaction is then able to continue, and locks all the rows.... but it doesn't get a new snapshot of the database state (the snapshot gets taken at statement start), so *it can't see that you inserted a row intojobs_running
. Thus, it grabs the same job, inserts a row for that job intojobs_running
, and commits.Condition re-checks
PostgreSQL has a quirky feature not included in most databases where, if a transaction blocks on a lock, it re-checks that the selected row still matches the lock condition when it gets the lock after the first transaction commits.
That's why the example in https://stackoverflow.com/questions/11532550/atomic-update-select-in-postgres works - it relies on qualifier re-checks in
WHERE
clauses after reclaiming a lock.The use of locking forces everything to run serially, so in practice you might as well have a single connection doing the work.
Isolation, ACID, and reality
Transaction isolation in PostgreSQL is not the perfect ideal where transactions run concurrently, but their effects are the same as if they were executed serially.
The only real world database that would deliver perfect isolation would be one that exclusively locks every table when a write transaction first accesses it, so in practice transactions can only be concurrent access if it's to different parts of the DB. Nobody wants to use such a database in situations where concurrency is desirable or useful.
All real world implementations are compromises.
READ COMMITTED
defaultPostgreSQL's default is
READ COMMITTED
isolation, a well-defined isolation level that permits non-repeatable and phantom reads, as discussed in the PostgreSQL manual on transaction isolation.SERIALIZABLE
isolationYou can request stricter
SERIALIZABLE
isolation on a per-transaction basis or as a per-user, per-database or (not recommended) global default. This provides much stronger guarantees, though still not perfect, at the cost of forcibly rolling back transactions if they interact in ways that couldn't happen if they had run serially.Because your concurrent queries will always try to grab the first job, no matter how many jobs there are, all but one will abort with serialization failures. So in practice you don't get any useful concurrency and might as well have a single connection handing out jobs to workers.
(Note that prior to PostgreSQL 9.1
SERIALIZABLE
offered much weaker guarantees and would not detect quite a lot of cases where transactions were interdependent.)Auto-re-run
SERIALIZABLE
PostgreSQL does not auto-re-run
SERIALIZABLE
transactions that abort due to serialization failures. There are cases where this would be quite useful, but other cases where doing so would be completely wrong and dangerous - especially where read/modify/write cycles via the application are involved. At this time there is no support for automatically rerunning transactions on serialization failures. The application is expected to retry.Don't write a DIY queuing system
It looks like what you're trying to do is write a queuing system. Consider not doing that. Writing a robust, reliable and correct queueing system is hard, and there are a few good ones already available that you could adopt. You have to handle things like crash-safety, what happens when someone takes a task then fails to complete it, race conditions around completion just as you give up on the task handler completing it, etc. There are lots of subtle concurrency problems. Don't try to DIY.
9.5 and
SKIP LOCKED
PostgreSQL 9.5, still in development, adds a feature that makes queuing quite a bit easier.
It lets you say that if when you
SELECT ... FOR UPDATE
, if a row is locked, you should ignore it and carry on to find the next not-locked row. This is very useful when combined with aLIMIT
as it makes it possible to say "find me the first row someone else isn't already trying to claim". So it becomes very simple to write queuing that grabs jobs safely and concurrently.Until that feature is available I strongly suggest that you stick to single connection queue management, or use a well-tested task queue system.