PostgreSQL – How to Get Transaction Commit Timestamp

postgresqltimestamp

I have data-pulling functionality that once in 5 seconds grabs all the data from Postgres table basing on modified_timestamp column. It works the following way:

  1. SELECT * FROM my_table WHERE modified_timestamp > _some_persisted_timestamp
  2. _some_persisted_timestamp = CURRENT_TIMESTAMP
  3. Processing data received from step 1
  4. Sleep for 5s
  5. Go to step 1

Where modified_timestamp updated with trigger (after any row update modified_timestamp becomes CURRENT_TIMESTAMP).
It worked fine until I noticed that CURRENT_TIMESTAMP in Postgres is transaction start timestamp in fact and some of the updates are lost. Why are they lost? That's quite simple – at the moment when I execute query SELECT * FROM my_table WHERE modified_timestamp > _some_persisted_timestamp some of the changes have already occurred, but modified_timestamp is before updated _some_persisted_timestamp because transaction is in progress still.

This problem could be easily solved by assigning in step 2 timestamp when update becomes visible for other transactions (transaction commit timestamp in other words) instead of CURRENT_TIMESTAMP or clock_timestamp().

I read documentation, but have found nothing related to transaction commit timestamp. Could you kindly suggest smth?

Btw, I'm aware of logical decoding and I know that this mechanism suits better for my needs in theory, but there are certain practical problems not allowing me to use it.

Best Answer

This problem could be easily solved by assigning in step 2 timestamp when update becomes visible for other transactions (transaction commit timestamp in other words) instead of CURRENT_TIMESTAMP or clock_timestamp().

This is logically impossible. Postgres writes new row versions before it finally commits to make them visible. It would require prophetic capabilities to write a future timestamp yet unknown at the time of writing.

However, you can get commit timestamps from a different source: since Postgres 9.5, there is a GUC setting track_commit_timestamp to start logging commit timestamps globally.

Then you can get commit timestamps with the utility function pg_xact_commit_timestamp(xid). Your query could look like:

SELECT * FROM my_table t
WHERE   pg_xact_commit_timestamp(t.xmin) > _some_persisted_timestamp;

Be aware that commit timestamps are not kept around forever. After two billion transactions transactions (2^31), transaction IDs are "frozen". That does not delete it right away, but after 4 billion transactions, the information is gone for certain. That's a big number of transactions, and only very busy databases burn that much over a lifetime. But there can be programming errors burning through transaction numbers more quickly than expected ...

Your step 2 and step 3 trade positions, and you record the commit timestamp instead of CURRENT_TIMESTAMP - or xmin from any freshly updated row to derive the commit timestamp with pg_xact_commit_timestamp() once more.

More:

About xmin:

But I am not completely sure I understand your task. Maybe you need a queuing tool or process rows one by one like discussed here: