PostgreSQL – Optimistic locking with PostgreSQL blocking due to locks

locking, postgresql

I'm trying to use optimistic locking in PostgreSQL, but it seems I'm misunderstanding how it should work. I thought that if I used the 'serializable' isolation level, each transaction would act as if the other transactions didn't exist: any checking would be done only at commit time, and a conflicting transaction would then abort. However, in the test I'm doing, the transactions do affect each other, in the sense that one transaction may block on a lock. This is the test I'm doing:

First, create the db:

CREATE USER myuser WITH PASSWORD '1234';
CREATE DATABASE tempdb;
GRANT ALL PRIVILEGES ON DATABASE tempdb to myuser;
\connect tempdb;
CREATE TABLE temptable (
    id        integer PRIMARY KEY,
    name       varchar(40) NOT NULL
);
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO myuser;

Then, create two python programs.

The first one inserts a row and then sleeps for 10 seconds before committing the transaction:

import time
import psycopg2

def main():
    conn = psycopg2.connect(dbname='tempdb', host='localhost', user='myuser', password='1234')
    # Without autocommit, psycopg2 issues its own BEGIN before the first
    # statement, so the explicit BEGIN below (including its isolation
    # level clause) would be ignored with a warning.
    conn.autocommit = True
    cur = conn.cursor()
    cur.execute("begin transaction isolation level serializable")
    cur.execute("insert into temptable (id, name) values (1, 'my name');")
    time.sleep(10)
    cur.execute("commit")
    print('test finished')

if __name__ == '__main__':
    main()

The second one does almost the same, but without the 10-second sleep:

import psycopg2

def main():
    conn = psycopg2.connect(dbname='tempdb', host='localhost', user='myuser', password='1234')
    # As above: let the explicit BEGIN/COMMIT control the transaction.
    conn.autocommit = True
    cur = conn.cursor()
    cur.execute("begin transaction isolation level serializable")
    cur.execute("insert into temptable (id, name) values (1, 'my other name');")
    cur.execute("commit")
    print('test finished')

if __name__ == '__main__':
    main()

If I run temp1.py and then, in another window, temp2.py, temp2.py hangs until temp1.py commits. What I expected instead is that temp2.py would insert the row and commit, and then, when temp1.py tried to commit, it would get an error.

Am I doing anything wrong? Is it possible to do what I want in postgresql?

(I'm using PostgreSQL 10.12.)

Best Answer

Q: However, in the test I'm doing, the transactions are affecting others, in the sense that one transaction might block in a lock.

That concurrent transactions do not affect each other at the serializable isolation level does not mean that one transaction never locks out another. Locks are unavoidable in databases as soon as concurrent transactions work on the same data.

Not affecting each other means that the result of any particular transaction does not differ from what it would be if no other transaction were working concurrently on the same data.

The time it takes to complete a transaction, or whether it had to wait for locks to be released, does not count as a result. Results are the rows and values that are read, written, or returned to the caller.

Q: But, what I would expect is that temp2.py would insert the row, commit and then, when temp1.py tries to commit, it would get an error.

The error concerns a violation of a unique constraint. Constraint checks can be deferred or immediate, independently of the isolation level, and they are immediate by default. With an immediate check, the second INSERT cannot know whether it violates uniqueness until the first transaction's uncommitted row with the same key is committed or rolled back, so it has to wait.
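The blocking in the test can be replayed by hand in two psql sessions; a sketch of the interleaving (the comments mark which session runs each statement, and the error message is the usual duplicate-key report, not output I have captured from this exact schema):

```sql
-- Session 1:
BEGIN ISOLATION LEVEL SERIALIZABLE;
INSERT INTO temptable (id, name) VALUES (1, 'my name');

-- Session 2 (meanwhile):
BEGIN ISOLATION LEVEL SERIALIZABLE;
INSERT INTO temptable (id, name) VALUES (1, 'my other name');
-- ...blocks here: the immediate unique check must wait to learn
-- whether session 1's uncommitted row with id = 1 will commit.

-- Session 1:
COMMIT;

-- Session 2 then fails immediately, along the lines of:
-- ERROR: duplicate key value violates unique constraint "temptable_pkey"
```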

If the unique column were declared as id integer PRIMARY KEY DEFERRABLE INITIALLY DEFERRED, the INSERT in transaction #2 would not hang; instead, transaction #2 would hang at COMMIT time until transaction #1 commits or aborts.
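A sketch of that variant of the table from the question (only the PRIMARY KEY declaration changes):

```sql
CREATE TABLE temptable (
    id    integer PRIMARY KEY DEFERRABLE INITIALLY DEFERRED,
    name  varchar(40) NOT NULL
);
-- The uniqueness check now runs at COMMIT time: transaction #2's INSERT
-- returns immediately, and the wait (and any duplicate-key error) moves
-- to its COMMIT.
```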

Q: Is it possible to do what I want in postgresql?

If you want the second transaction (the one that inserts second) to be guaranteed to succeed because it reaches COMMIT faster than the one that inserted first, I don't think that's possible. From the point of view of the serializable isolation logic, it doesn't really matter which transaction fails. Any transaction may fail, and the caller must always be prepared to react appropriately according to SQLSTATE (the error code).
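In practice that reaction is usually a retry loop. A minimal, database-free sketch of the pattern, where the SerializationFailure class is a stand-in for psycopg2.errors.SerializationFailure (SQLSTATE 40001), and run_with_retry and flaky_txn are names made up for illustration:

```python
class SerializationFailure(Exception):
    """Stand-in for psycopg2.errors.SerializationFailure (SQLSTATE 40001)."""

def run_with_retry(txn, max_attempts=3):
    """Run a transaction callable, retrying when it fails to serialize."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            # A real application might back off here before retrying.

# Example: a transaction that fails to serialize twice, then succeeds.
attempts = []

def flaky_txn():
    attempts.append(1)
    if len(attempts) < 3:
        raise SerializationFailure()
    return 'committed'

print(run_with_retry(flaky_txn))  # prints: committed
```

With a real connection, the callable would begin a serializable transaction, run its statements, and commit; the loop re-runs the whole transaction from the start, which is the only safe unit of retry.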