We are considering pessimistic locking for our project (SELECT ... FOR UPDATE); as we are used to optimistic locking, we are worried about server operation being blocked on such a lock (a lock never getting released).
The most common concern people have is obviously deadlocks, but we have another: what if the client dies, or loses its connection to the server, while holding a SELECT ... FOR UPDATE lock?
In many cases this would not matter, because if the client dies the application dies with it; in our case, however, we will have a cluster of servers with failover, and we don't want the other nodes in the cluster to be badly affected if one node dies.
On this topic I have found this 2003 message concerning PostgreSQL:
http://www.postgresql.org/message-id/3E81D7F0.2010604@xythos.com
We have also tested the tcp_keepalives_idle PostgreSQL configuration parameter, and confirmed that if we set it low enough and kill the client, the lock gets released. With the parameter left at its default, however, the lock was not released (though perhaps we simply didn't wait long enough).
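For reference, the keepalive behaviour we tested is controlled by three server-side parameters in postgresql.conf (the values below are illustrative choices for our test, not recommendations):

```
# postgresql.conf -- illustrative values, not recommendations
tcp_keepalives_idle = 60      # seconds of inactivity before the first keepalive probe
tcp_keepalives_interval = 10  # seconds between probes once they start
tcp_keepalives_count = 6      # unanswered probes before the connection is considered dead
```

With settings like these, a vanished peer is detected after roughly idle + interval × count seconds (about two minutes here). Leaving the parameters at 0 falls back to the operating system defaults, which on Linux can mean over two hours before the dead connection, and thus the lock, is cleaned up.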
And that's without considering the concern raised by that 2003 message about the difference between the lifecycle of the client application and the lifecycle of the connection pool (we'll keep the defaults here, but we expect each node to have its own connection pool, with no sharing across the cluster).
The problem gets even more complicated because we intend to support several database servers (at least Microsoft SQL Server and PostgreSQL, possibly Oracle).
Our solution would use JavaEE and Glassfish as an application server.
So, to summarize the question: is it safe to assume that, deadlocks aside, a lock cannot stay held forever, and will instead be released after a reasonably short time (what range of duration?), even if the client holding it dies suddenly or loses connectivity?
Best Answer
No, I don't think it's safe to assume locks from dead/vanished clients are released in a bounded and deterministic amount of time with all DBMSes and drivers. You'll need to investigate each configuration separately.
In the case of PostgreSQL you're generally, but not always, OK if you have TCP keepalives set quite aggressively, because:

- if the client process dies, the operating system will RST the TCP connection as part of process cleanup, so the server notices fairly quickly;
- if the client host dies or drops off the network, the aggressive TCP keepalives will eventually time the dead connection out.

However, there are a few cases that will not be handled:

- a connection pool may keep the underlying connection open without running DISCARD ALL, and thus fail to release and reset session-level resources like advisory locks (if you use them);
- the client application may be alive but not making progress: SIGSTOPped, paused by a debugger, hitting an internal threading deadlock, etc. The OS will keep on responding to TCP keepalives, but the app won't respond to Pg protocol messages or advance its work.

In the case of PostgreSQL you can use active lock monitoring to scan for and terminate long-running transactions that haven't done anything in a while. In particular, you can deal with <IDLE> in transaction sessions by scanning pg_stat_activity (though it's only possible to do this reliably and easily in 9.2 and newer). With a bit more effort you can use pg_locks to watch for queries blocked on a lock for more than x seconds and kill the session holding the lock, though this can make it hard to run some DDL like index creation.

What you really need is application-level keepalives, where the app says "Yup, I'm alive and responsive". These are rather harder to implement, though.
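As a rough sketch of the monitoring approach described above (using PostgreSQL 9.2+ column names; the 5-minute threshold is an arbitrary choice for illustration):

```sql
-- Kill sessions that have sat "idle in transaction" for more than 5 minutes.
-- pg_terminate_backend ends the whole session, so clients must reconnect.
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND state_change < now() - interval '5 minutes';

-- Simplified view of who is waiting on a lock and who holds it; a complete
-- version must match pg_locks rows on every locked-object column.
SELECT waiting.pid AS waiting_pid, holder.pid AS holding_pid
FROM pg_locks waiting
JOIN pg_locks holder
  ON  holder.locktype = waiting.locktype
  AND holder.relation      IS NOT DISTINCT FROM waiting.relation
  AND holder.transactionid IS NOT DISTINCT FROM waiting.transactionid
  AND holder.granted
WHERE NOT waiting.granted;
```

You would run something like the first query from cron or another scheduler; just be aware that any session it terminates loses its work, so the client side must be prepared to reconnect and retry.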
One thing that will help is that both PgBouncer and PgPool-II (external connection pools for PostgreSQL) support controls for session and transaction timeouts. We've wanted to implement similar options in core PostgreSQL for some time, but nobody's come up with a design that's robust enough to handle all the corner cases, so for now your best bet is to use an external pooler. You can do this even if you're also using an application-level connection pool.
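By way of illustration, in PgBouncer these timeouts are plain pgbouncer.ini settings (parameter names as I recall them from PgBouncer's documentation; the values are examples only):

```
[pgbouncer]
; server connections whose transaction stays idle longer than this (seconds)
; are dropped, releasing any locks they hold
idle_transaction_timeout = 120
; queries running longer than this are aborted -- use with care
query_timeout = 300
```

Both default to 0 (disabled), so you have to opt in to this behaviour explicitly.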
On the good news front, PostgreSQL automatically detects and breaks deadlocks between transactions, so one thing you don't have to worry about much is deadlocks at the SQL level when using PostgreSQL.