PostgreSQL MVCC – Visible Tuple with xmin > Current txid

mvccpostgresql

I'm trying to better understand MVCC and am confused by the following (contrived) scenario. I was under the impression that for a row to be visible to a transaction, the transaction's id must be greater than the tuple's xmin value.

create table test (id int);

(session 1)
begin;
select txid_current();
┌──────────────┐
│ txid_current │
├──────────────┤
│          704 │
└──────────────┘

(session 2)
begin;
insert into test values (1);
commit;

(session 1)
select xmin, xmax, id from test;
┌──────┬──────┬────┐
│ xmin │ xmax │ id │
├──────┼──────┼────┤
│  705 │    0 │  1 │
└──────┴──────┴────┘

How can a transaction with an id of 704 see a tuple with an xmin of 705?

Update:

Reading http://www.interdb.jp/pg/pgsql05.html helped clarify things a lot more for me. A few videos I had previously watched seemed to imply that, as a hard rule, a tuple is only visible if its xmin is less than the current transaction's txid.

However, the above reading painted a slightly different picture for me. In an isolation level of READ COMMITTED, a transaction gets a new snapshot of the state of other transactions at the start of each query.

The first query in session 1 would get a snapshot of 704:704: (you can see this by running select txid_current_snapshot()) which means that from the point of view of txid 704, all txids < 704 are inactive (committed or aborted) and all transactions >= 704 are active (in progress or not yet started).

The second query of session 1 would get a new snapshot of 704:706: which means that from the point of view of txid 704, all txids < 704 are inactive, and all txids >= 706 are active. However, since txid 705 fits in between these values, and since it has in fact been committed, it too is part of the new snapshot that txid 704 received, and is visible. Therefore, it is not a hard rule that a tuple is only visible if its xmin is less than the current transaction's txid.

Best Answer

This depends on the isolation level. The default level for postgresql is "Read Committed" at which level this can happen. In postgresql it won't happen at "Repeatable Read" or "Serializable".

This behavior is often not a problem and higher serialization levels can slow down the server which is the reason why they are not the default.

For an extensive explanation please see the manual https://www.postgresql.org/docs/9.6/static/transaction-iso.html