Postgresql – Postgres functional indexes that returns an array and used in joins

indexjoin;postgresql

I have a table that looks like this:

Value Table
(active), (version_no), (thing_id), (data)
------
true,       3,            7,          "active-value"
true,       2,            6,          "other-thing"
false,      2,            7,          "old-value"
false,      1,            7,          "first-value"

That is to say, each "thing" has one value. Every time the value is updated (only active values get updated), a trigger puts the old version into the table and marks it as inactive. I only want one version of each piece of data.

I would like to have another table reference the values. But it can reference it in one of two ways. It can either reference the "active" value for a thing or a fixed version.

Reference Table:
(version_no), (thing_id)
------
2,               7 -- Will always return "old-value"
3,               7 -- Will always return "active-value"
null,            7 -- Will return "active-value" until a new value is inserted...

I would like to have a single way to query the references based on the value they point to.

One way I thought of is to have a functional gist index f(active, thing_id, version) on the values table that would return a list of tuples that a reference could match on. For instance f(false, 2, 1) -> [(2, 1)] and f(true, 2, 3) -> [(2, 3), (2, null)]

Then check if

SELECT references.*
FROM references
INNER JOIN values
    ON (
        (reference.thing_id, reference.version) = 
        ANY(f(values.active, values.thing_id, values.version)
       )
WHERE values.data = "abc"

Am I on the right track? Is there a better way of having "dual references" to another table? Is there a more efficient way than an array?

Best Answer

To me it seems that the requirements are not very clear. And we don't have a definition (CREATE TABLE) of the two tables, so I'll first write my assumptions of what the tables are what they should be:

    CREATE TABLE value
      ( active boolean not null,                   
        version_no int not null, 
        thing_id int not null, 
        data text, 
        CONSTRAINT thing_version_UQ
            UNIQUE (thing_id, version_no)
      ) ;
  • Whether you have or not that UNIQUE constraint is not sure but from the text it seems that each time the value of data is changed for a "thing", a new version is inserted. And we can't have 2 rows with same thing and version so if the constraint isn't there, it should added.

  • There is only one active version per thing. It's not at all obvious if this restriction is enforced but it can be, with a partial index:

    CREATE UNIQUE INDEX only_one_active_version_per_thing_PUQ
        ON value
            (thing_id)
        WHERE (active) ;
    

Then we have the consideration for a new references table.

  • First, references is a reserved keyword. I strongly suggest that reserved words should not be used in a database, table, column (or any other object) name. The same goes for VALUES and VALUE. These are not reserved in Postgres but they are in SQL. I would find an insert like the following confusing at first sight:

    insert into values values
        (true,   3,  7,  'active-value'),
        (true,   2,  6,  'other-thing'),
        (false,  2,  7,  'old-value'),
        (false,  1,  7,  'first-value') ; 
    
  • For the actual suggestion, I don't see why you can't use simple queries. To request a specific thing and version:

    SELECT active, date
    FROM value
    WHERE (thing_id, version_no) = (2, 1) ;
    
  • and to request for the active version of a thing:

    SELECT version, date
    FROM value
    WHERE active 
      AND thing_id = 2 ;
    
  • You could easily write a query or a function which covers both cases, if you need to. But having an extra table doesn't serve any purpose in my opinion.