Postgresql – query for each row from another arbitrary query, and union results

postgresql

I'm trying to build a query that will take another query. Then, perform another query based on those results. Union the results so that there are no duplicates. All on the same table. I've seen some similar posts here and here, but the "input" query could be almost anything and the number of columns may change.

Here's a sample dataset:

  num  |  sub  |                      valid_time                      | col1 | col2  |  ...
-------+-------+------------------------------------------------------+------+-------+------
   1   |   1   | ["2018-12-31 01:00:00-00","2019-12-31 01:00:00-00")  |   X  |  ...  |  ...
   1   |   1   | ["2018-12-31 01:00:00-00","2019-12-31 01:00:00-00")  |   Y  |  ...  |  ...
   1   |   2   | ["2018-12-31 01:00:00-00","2019-12-31 01:00:00-00")  |   Z  |  ...  |  ...
   2   |   3   | ["2018-12-31 01:00:00-00","2019-12-31 01:00:00-00")  |   X  |  ...  |  ...
   2   |   4   | ["2018-12-31 01:00:00-00","2019-12-31 01:00:00-00")  |   Y  |  ...  |  ...
   2   |   3   | ["2018-12-31 01:00:00-00","2019-12-31 01:00:00-00")  |   Z  |  ...  |  ...

Based on some of the examples I was seeing with JOINs, a subset of columns were typically picked out. I'm not sure how to return all columns with a JOIN. The number of columns may change over time, and all the columns need to be returned. I would rather not have to update this query if a new column is added.

This does not work (I think because I need to JOIN somehow), but it shows where my head is at.

with cte as (
    select *
    from table
    where col1 = 'X' and 
    now() <@ valid_time
)
select * from cte
union
select * from table
    where table.num = cte.num and 
    table.sub = cte.sub and 
    now() <@ valid_time;

I mentioned earlier that the "input" query could be almost anything. I plan to templatized similar to:

with cte as (%(query)s)
select * from cte
union
select * from table
    where table.num = cte.num and 
    table.sub = cte.sub and 
    %(query_time_part)s;

Finally based on the example query, I would expect something like so:

  num  |  sub  |                      valid_time                      | col1 | col2  |  ...
-------+-------+------------------------------------------------------+------+-------+------
   1   |   1   | ["2018-12-31 01:00:00-00","2019-12-31 01:00:00-00")  |   X  |  ...  |  ...
   1   |   1   | ["2018-12-31 01:00:00-00","2019-12-31 01:00:00-00")  |   Y  |  ...  |  ...
   2   |   3   | ["2018-12-31 01:00:00-00","2019-12-31 01:00:00-00")  |   X  |  ...  |  ...
   2   |   3   | ["2018-12-31 01:00:00-00","2019-12-31 01:00:00-00")  |   Z  |  ...  |  ...

Best Answer

How about:

with cte as (%(query))
select unique table.* from table join cte using (num,sub)
    where %(query_time_part);

I'm assuming here that "%(query_time_part)" is written such that it applies to the literal "table", while "%(query)" already contains the analogous code which applies to the "table" underlying "cte".

The rows that make up the cte should automatically meet the join condition to self-join, so should not need any special steps, like a UNION, for them to be emitted in their own right. That is, the version of them that get emitted is the version that comes from "table", not from "cte", but since they are identical it doesn't matter.