Postgresql – Optimizing PostgreSQL hstore for lots of small updates

Tags: hstore, postgresql

I have a situation where I will have one web session per row, but this session will generate dozens of updates to an hstore field in that row, one k/v at a time. At the end, I will have a completed structure. The keys will be more or less the same per session, but will evolve over time. The values will be mixed, with some being unique per session (e.g. email) and some having lots of repetition (e.g. male/female).
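To make the scenario concrete, this is a minimal sketch of that update pattern; the table and column names (`sessions`, `data`) are hypothetical:

```sql
CREATE EXTENSION IF NOT EXISTS hstore;

-- One row per web session, with the accumulating k/v structure
CREATE TABLE sessions (
    id   bigint PRIMARY KEY,
    data hstore NOT NULL DEFAULT ''::hstore
);

-- Each step of the session merges a single key/value pair into the hstore
UPDATE sessions
SET    data = data || hstore('email', 'user@example.com')
WHERE  id = 42;

UPDATE sessions
SET    data = data || hstore('gender', 'male')
WHERE  id = 42;
```

The `||` operator merges hstores, overwriting any existing key, so repeated updates converge on the completed structure.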

I am running under the assumption that lock contention will only be present at the row level, which is fine. What other gotchas should I look out for? I have zero experience with hstore, so I really want to make sure I understand what I am in for. Any feedback is appreciated.

Best Answer

With lots of UPDATEs, each one, no matter how small, will cause the entire row to be rewritten into a new version, as a consequence of the MVCC mechanism.
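You can observe the new-version-per-update behavior via the system columns; this sketch assumes the hypothetical `sessions` table from the question:

```sql
-- xmin is the transaction that created the current row version;
-- it changes on every UPDATE, even a one-key hstore change
SELECT ctid, xmin FROM sessions WHERE id = 42;

UPDATE sessions SET data = data || hstore('step', '1') WHERE id = 42;

-- xmin now differs, showing the whole row was rewritten as a new version
SELECT ctid, xmin FROM sessions WHERE id = 42;
```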
Then the old version of the row will be picked up at some point by autovacuum, once it's certain that no transaction may still need it, and its space will be flagged as reusable.
The constant turnover of disk space for old and new rows leads to fragmentation, especially if the rows are large in size.
Additionally, the whole set of operations is logged into the WAL files (unless the table is unlogged).
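If the session data really is disposable, the unlogged option mentioned above looks like this (table definition is illustrative):

```sql
-- An UNLOGGED table skips WAL entirely: much cheaper writes, but the
-- contents are truncated after a crash and are not replicated
CREATE UNLOGGED TABLE sessions (
    id   bigint PRIMARY KEY,
    data hstore
);
```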

For these reasons, high-frequency UPDATEs of large columns are a worst-case scenario for PostgreSQL.

So, if that session data doesn't really require durable storage in the first place, a specialized mixed memory/disk key/value engine like Redis is likely to perform much better.

Otherwise, this other question, What fillfactor for caching table?, has good information and advice on how to mitigate these difficulties.
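The main mitigation discussed there is lowering the table's fillfactor, which leaves free space in each page so a rewritten row version can stay on the same page (a HOT update, as long as no indexed column changed), avoiding index churn. A sketch, again using the hypothetical `sessions` table:

```sql
-- Leave half of each page free for updated row versions
ALTER TABLE sessions SET (fillfactor = 50);

-- Or set it at creation time:
-- CREATE TABLE sessions (...) WITH (fillfactor = 50);
```

Note that `ALTER TABLE ... SET (fillfactor = ...)` only affects future page layout; a `VACUUM FULL` or rewrite is needed for existing pages to honor it.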