Say that you want to build an append-only table in Postgres, where each modification is actually just a new row added to the table. Say that you have the simple table of users below (and disregard all questions about performance).
create table user_revisions (id uuid, name text, password text, version bigserial);
insert into user_revisions (id, name, password)
values ('743ccdf0-e9d8-4268-b6d7-0645eb70feb9', 'Bengan', 'nthusaeo');
insert into user_revisions (id, name, password)
values ('743ccdf0-e9d8-4268-b6d7-0645eb70feb9', 'Bengan', 'åäö');
Getting the latest version of a user would look something like:
select * from user_revisions e1
where e1.id='743ccdf0-e9d8-4268-b6d7-0645eb70feb9'
order by version desc limit 1;
But then you come to the part where you want to start modifying the users. Each request to the database always has the uuid
of the user being modified and some subset of all the fields that are otherwise present in the user_revisions
table.
You want to make an insert with the last revision of the user as a set of default values. That is, do the equivalent of the Clojure code (merge a b)
or Python {**a, **b}
, where the two associative data structures a and b are merged, and if there are any keys in a that also exist in b, the values held in b override those of a.
Below are some failed attempts at writing a select that produces the output that we in turn want to insert as the latest revision.
with temp (password) as (values ('snthsnth'))
select * from user_revisions e1 natural left join temp
where e1.id='743ccdf0-e9d8-4268-b6d7-0645eb70feb9';
with temp (password) as (values ('snthsnth'))
select * from user_revisions e1 natural inner join temp
where e1.id='743ccdf0-e9d8-4268-b6d7-0645eb70feb9';
with temp (password) as (values ('snthsnth'))
select * from user_revisions e1 natural right join temp
where e1.id='743ccdf0-e9d8-4268-b6d7-0645eb70feb9';
The desired output would be:
password | id | name
-----------+--------------------------------------+--------
snthsnth | 743ccdf0-e9d8-4268-b6d7-0645eb70feb9 | Bengan
This output would then be inserted into the table as the latest version of the user with the id 743ccdf0-e9d8-4268-b6d7-0645eb70feb9
.
Best Answer
It depends if you need something generic or if it can work with a "small" number of columns.
In your specific example, a query that merges the new values over the latest revision would produce the expected output shown above.
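A sketch of such a query, assuming the same temp CTE as in the question and using coalesce so that a value supplied in temp overrides the stored one (this is a reconstruction of the idea, not necessarily the exact SQLFiddle query):

```sql
with temp (password) as (values ('snthsnth'))
select coalesce(temp.password, e1.password) as password,
       e1.id,
       e1.name
from user_revisions e1
cross join temp
where e1.id = '743ccdf0-e9d8-4268-b6d7-0645eb70feb9'
order by e1.version desc
limit 1;
```

Fields the request does not carry can be passed as NULL in temp, and coalesce then falls back to the latest revision's value.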
You can test things in this SQLFiddle: http://sqlfiddle.com/#!17/c4700/5/0
I am quite confident it could be made generic if needed, but it may already be enough for your needs.
However, it will not work correctly if you need to store NULL values, or if you need to handle transitions such as "some data value => NULL".
In such cases you would really need to test the existence of each field and take its value if present, including when it is NULL.
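One way to really test the presence of each field (an approach offered here as a sketch, not from the original answer) is to merge jsonb values, since the || operator has exactly the {**a, **b} semantics, including explicit nulls:

```sql
-- keys present in the right operand override the left,
-- even when the new value is null
select '{"name":"Bengan","password":"nthusaeo"}'::jsonb
       || '{"password":null}'::jsonb;
-- result: {"name": "Bengan", "password": null}
```

jsonb_populate_record can then turn the merged document back into a user_revisions row for insertion.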
In passing I would also say, like @Lennart, that it is better to separate archive data from live data, for many reasons. One is performance: to compute the "next" version you will only have to read one row (the live one) and then do something as above, instead of having to do things like
order by version desc limit 1
each time, which will be more costly, even with an index on version.
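A sketch of that separation, with hypothetical table and trigger names, where an update to the live row automatically archives the superseded version:

```sql
-- live table: exactly one row per user, always the current version
create table users_live (
  id uuid primary key,
  name text,
  password text,
  version bigint not null default 1
);

-- append-only archive of superseded revisions
create table users_archive (
  id uuid,
  name text,
  password text,
  version bigint
);

-- copy the old row into the archive and bump the version on every update
create function archive_user() returns trigger as $$
begin
  insert into users_archive
  values (old.id, old.name, old.password, old.version);
  new.version := old.version + 1;
  return new;
end;
$$ language plpgsql;

create trigger users_live_archive
  before update on users_live
  for each row execute function archive_user();
```

With this in place, reading the current version is a single-row primary key lookup on users_live, and the order by version desc limit 1 pattern is only ever needed against the archive.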