Postgresql – Is it possible to use txid_current value for sync

data synchronizationpostgresql

I need to expose my data through Rest API for a mobile offline experience.
The idea is basically that a mobile client downloads all data to his device. after a week he wants to download only the changes into his device. each mobile client synced to a different point of time.
My problem is how to create a deterministic cursor for the client which they will send it on their next sync request and I'll be able to figure out the changes since their last sync.

Assumptions:

  1. Clients tolerate data duplication.
  2. My database is insert-update only (never delete).

Possible solutions:

  1. Logical Decoding
    It's relatively (to other solutions we have in mind) hard to create a plugin and it doesn't fit our needs because our permission model, each client needs to get only rows he is permitted to see.
  2. txid_current + cmin value.
    Save the transaction id on each row to create incremental value, cmin is needed to create a more precise cursor (many rows may have the same txid, but only one row has a specific txid + cmin pair).
  3. Use xmin only, and re-sync (sync from scratch) user's upon wraparound. (according to the epoch counter).
  4. Add SERIAL column/sequence, and bump it up manually or with triggers, it won't make us have to serialize our transaction if we are using Overlapping Sync as describe below.
  5. Serialize all transactions that act on the most top resource in my model (database), which is least preferred because of the performance and complexity it introduces to the application code.

Currently, I really want to use option 2, because it seems like the simplest solution.

How I'm going to handle Consistent Ordering:

The problem:
If the client syncs up to the latest visible (committed) change there may be active transactions that include earlier changes that are not yet visible.

Consistent Ordering problem

The solution:
Overlapping Sync the clients, Sync up to the latest visible change but keep track of what the latest safe change was. On the next sync, ask for changes since the lastest safe change rather than the latest visible change.


Is option 2 valid? (under the assumption above and the Overlapping Sync solution), if not, what is the next simplest solution? there is another solution which I didn't think about?

Best Answer

I would tag each inserted row with the current transaction ID (txid_current() or, from v13 on, pg_current_xact_id()).

The “latest safe change” is the right before the xmin of the “current snapshot”, which can be found with txid_snapshot_xmin(txid_current_snapshot()) (or pg_snapshot_xmin(pg_current_snapshot()) from v13 on). That xmin is the transaction ID of the oldest still active concurrent transaction, that is, everything older than that is done and visible.

This should make it fairly easy to build a replication system.