Postgresql – How to generate a pivoted CROSS JOIN where the resulting table definition is unknown

dynamic-sql pivot postgresql

Given two tables with an undefined row count, each with a name and a value, how would I display a pivoted CROSS JOIN of a function over their values?

CREATE TEMP TABLE foo AS
SELECT x::text AS name, x::int
FROM generate_series(1,10) AS t(x);

CREATE TEMP TABLE bar AS
SELECT x::text AS name, x::int
FROM generate_series(1,5) AS t(x);

For example, if that function were multiplication, how would I generate a multiplication table, with the values of one table as rows and the values of the other as columns?

All of those (arg1,arg2,result) rows can be generated with

SELECT foo.name AS arg1, bar.name AS arg2, foo.x*bar.x AS result
FROM foo
CROSS JOIN bar;

So this is only a question of presentation. I would also like this to work with a custom name: a name that is not simply the argument cast to text but is set in the table,

CREATE TEMP TABLE foo AS
SELECT chr(x+64) AS name, x::int
FROM generate_series(1,10) AS t(x);

CREATE TEMP TABLE bar AS
SELECT chr(x+72) AS name, x::int
FROM generate_series(1,5) AS t(x);

I think this would be easily doable with a crosstab() capable of a dynamic return type.

SELECT * FROM crosstab(
  '
    SELECT foo.x AS arg1, bar.x AS arg2, foo.x*bar.x
    FROM foo
    CROSS JOIN bar
  ', 'SELECT DISTINCT name FROM bar'
) AS **MAGIC**

But, without the **MAGIC**, I get

ERROR:  a column definition list is required for functions returning "record"
LINE 1: SELECT * FROM crosstab(

For reference, using the above examples with names, this is closer to what tablefunc's crosstab() wants.

SELECT * FROM crosstab(
  '
    SELECT foo.x AS arg1, bar.x AS arg2, foo.x*bar.x
    FROM foo
    CROSS JOIN bar
  '
) AS t(row int, i int, j int, k int, l int, m int);

But now we're back to making assumptions about the content and size of the bar table in our example. So if:

The tables are of undefined length,
Then the cross join represents a cube of undefined dimension (because of the above),
The category names (crosstab parlance) are in the table,

What's the best we can do in PostgreSQL without a "column definition list" to generate that kind of presentation?

Best Answer

Simple case, static SQL

The non-dynamic solution with crosstab() for the simple case:

SELECT * FROM crosstab(
  'SELECT b.x, f.name, f.x * b.x AS prod
   FROM   foo f, bar b
   ORDER  BY 1, 2'
   ) AS ct (x int, "A" int, "B" int, "C" int, "D" int, "E" int
                 , "F" int, "G" int, "H" int, "I" int, "J" int);

I order resulting columns by foo.name, not foo.x. Both happen to be sorted in parallel, but that's just the simple setup. Pick the right sort order for your case. The actual value of the second column is irrelevant in this query (1-parameter form of crosstab()).

We don't even need crosstab() with 2 parameters because there are no missing values by definition. See:

PostgreSQL Crosstab Query
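For contrast, here is a rough sketch of what the 2-parameter form could look like. It would only matter if some (row, category) combinations were missing, in which case the gaps come back as NULL in the matching column. It assumes the tablefunc extension is installed and that foo and bar are the letter-named tables from the question. The column list is still written out by hand.

```sql
-- Assumes: CREATE EXTENSION IF NOT EXISTS tablefunc;
SELECT * FROM crosstab(
  -- Source rows: row name, category, value. Only the row name must be sorted.
  'SELECT b.x, f.name, f.x * b.x AS prod
   FROM   foo f, bar b
   ORDER  BY 1',
  -- Category list: one row per output column, in output column order.
  'SELECT name FROM foo ORDER BY name'
   ) AS ct (x int, "A" int, "B" int, "C" int, "D" int, "E" int
                 , "F" int, "G" int, "H" int, "I" int, "J" int);
```

With this form, values are matched to columns by category name rather than by position.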

(You fixed the crosstab query in the question by replacing foo with bar in a later edit. This also fixes the query, but keeps working with names from foo.)

Unknown return type, dynamic SQL

Column names and types cannot be dynamic. SQL demands to know the number, names and types of resulting columns at call time, either from an explicit declaration or from information in the system catalogs. (That's what happens with SELECT * FROM tbl: Postgres looks up the registered table definition.)

You want Postgres to derive resulting columns from data in a user table. Not going to happen.

One way or the other, you need two round trips to the server. Either you create a cursor and then walk through it. Or you create a temp table and then select from it. Or you register a type and use it in the call.
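The temp table route, for instance, might look roughly like the sketch below. It assumes the tablefunc extension is installed and that foo and bar are the letter-named tables from the question. The table name pivot_result and the variable _cols are made up for illustration.

```sql
-- Round trip 1: build the column definition list from the data, then
-- materialize the crosstab result in a temp table via dynamic SQL.
DO
$do$
DECLARE
   _cols text;  -- hypothetical name; will hold e.g. "A" int, "B" int, ...
BEGIN
   SELECT string_agg(quote_ident(name) || ' int', ', ' ORDER BY name)
   INTO   _cols
   FROM   foo;

   EXECUTE format(
      $q$CREATE TEMP TABLE pivot_result AS
         SELECT * FROM crosstab(
           'SELECT b.x, f.name, f.x * b.x AS prod
            FROM   foo f, bar b
            ORDER  BY 1, 2'
         ) AS ct (x int, %s)$q$
    , _cols);
END
$do$;

-- Round trip 2: by now the table definition is registered in the catalogs,
-- so SELECT * can resolve the columns.
SELECT * FROM pivot_result ORDER BY x;
```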

Or you simply generate the query in one step and execute it in the next:

SELECT $$SELECT * FROM crosstab(
  'SELECT b.x, f.name, f.x * b.x AS prod
   FROM   foo f, bar b
   ORDER  BY 1, 2'
   ) AS ct (x int, $$
 || string_agg(quote_ident(name), ' int, ' ORDER BY name) || ' int)'
FROM   foo;

This generates the query above, dynamically. Execute it in the next step.
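If the client happens to be psql (9.6 or later), the two steps can be typed as one by ending the generating query with the \gexec meta-command instead of a semicolon. psql then sends each returned value back to the server as a new statement. This is only a convenience sketch for that one client, and it still makes two round trips under the hood.

```sql
-- psql only: \gexec executes the generated query text as a statement.
SELECT $$SELECT * FROM crosstab(
  'SELECT b.x, f.name, f.x * b.x AS prod
   FROM   foo f, bar b
   ORDER  BY 1, 2'
   ) AS ct (x int, $$
 || string_agg(quote_ident(name), ' int, ' ORDER BY name) || ' int)'
FROM   foo
\gexec
```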

I am using dollar-quotes ($$) to keep handling of nested quotes simple. See:

Insert text with single quotes in PostgreSQL

quote_ident() is essential to escape otherwise illegal column names (and possibly defend against SQL injection).
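A quick illustration of what quote_ident() does with a few sample names (the sample strings are made up for the example):

```sql
SELECT quote_ident('A')      AS upper_case  -- "A"       quoted to preserve case
     , quote_ident('a')      AS lower_case  -- a         no quoting needed
     , quote_ident('my col') AS with_space  -- "my col"
     , quote_ident('select') AS reserved;   -- "select"  reserved word
```

Without the quoting, the upper-case names from foo would be folded to lower case in the generated column list.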

Postgresql – How to change the mount point for a column in postgresql table

Maybe I'm missing something, but it sounds as if that is a simple case of replacing the value:

update the_table
   set link = replace(link, '/ABC/ABC_DATA/Test/', '/CBF/CBF_DATA/Documents/Test_DATA/');
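A slightly more cautious variant, as a sketch: preview the rewrite first, then restrict the update to rows that actually contain the old path, so unaffected rows are not rewritten. It uses the same table and column names as above. Using strpos() for the filter is my choice here, not part of the original answer.

```sql
-- Preview what would change before touching anything.
SELECT link AS old_link
     , replace(link, '/ABC/ABC_DATA/Test/', '/CBF/CBF_DATA/Documents/Test_DATA/') AS new_link
FROM   the_table
WHERE  strpos(link, '/ABC/ABC_DATA/Test/') > 0;

-- Same replacement, limited to rows that contain the old path.
UPDATE the_table
SET    link = replace(link, '/ABC/ABC_DATA/Test/', '/CBF/CBF_DATA/Documents/Test_DATA/')
WHERE  strpos(link, '/ABC/ABC_DATA/Test/') > 0;
```

strpos() is used instead of LIKE because the underscore in the path would act as a single-character wildcard in a LIKE pattern.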

Postgresql – Using table inheritance instead of mapping tables

Inheritance is one of those features that I wouldn't touch. AFAIK, it's used internally for replication and partitioning in some capacity. I'm not sure if it was even designed with the intent to be used by the end-user.

Concrete Technical Drawbacks

Drawbacks on UNIQUE and REFERENCES

The docs cover some of the drawbacks in the Caveats section (the excerpt below is important).

If we declared parent.name to be UNIQUE or a PRIMARY KEY, this would not stop the child table from having rows with names duplicating rows in parent. And those duplicate rows would by default show up in queries from parent. In fact, by default child would have no unique constraint at all, and so could contain multiple rows with the same name. You could add a unique constraint to child, but this would not prevent duplication compared to parent.
Similarly, if we were to specify that parent.name REFERENCES some other table, this constraint would not automatically propagate to child. In this case you could work around it by manually adding the same REFERENCES constraint to child.
Specifying that another table's column REFERENCES parent(name) would allow the other table to contain parent names, but not child names. There is no good workaround for this case.
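The first and third caveats can be reproduced with a small throwaway setup. This is a sketch using temp tables. The table name ref and the sample values are made up; parent and child mirror the names in the quoted docs.

```sql
CREATE TEMP TABLE parent (name text PRIMARY KEY);
CREATE TEMP TABLE child () INHERITS (parent);

INSERT INTO parent VALUES ('dup');
INSERT INTO child  VALUES ('dup');  -- accepted: the PRIMARY KEY is not inherited
INSERT INTO child  VALUES ('dup');  -- accepted again: child has no unique constraint

SELECT count(*) FROM parent;        -- 3: rows from child are included by default
SELECT count(*) FROM ONLY parent;   -- 1

-- A foreign key to parent only sees rows stored in parent itself.
INSERT INTO child VALUES ('child_only');
CREATE TEMP TABLE ref (name text REFERENCES parent (name));
INSERT INTO ref VALUES ('dup');     -- accepted: 'dup' exists in parent
-- Expected to fail with a foreign key violation:
-- INSERT INTO ref VALUES ('child_only');
```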

Slow progress developing INHERITs

These deficiencies were first mentioned in the docs for 7.3, released in 2002, though they have existed since inheritance was implemented:

This deficiency will probably be fixed in some future release.

And the only change was to make the deficiencies more explicit and verbose in the docs for 8.0, released in 2005:

These deficiencies will probably be fixed in some future release, but in the meantime considerable care is needed in deciding whether inheritance is useful for your problem.

Good luck waiting for that "some future release". And some of the features you talk about just aren't unique to composition:

Saving a "key" is moot

no surrogate key on 'sometype', it's explicitly a composition

How is that different from making sometype an attribute list, and linking directly to it?

CREATE TABLE sometype (sometype_name text PRIMARY KEY);
CREATE TABLE foo (foo_id serial PRIMARY KEY);
CREATE TABLE foo_sometype (
  foo_id int REFERENCES foo,
  sometype_name text REFERENCES sometype,
  PRIMARY KEY ( foo_id, sometype_name )
);

Now you don't even have to join foo_sometype to sometype to get sometype.sometype_name.

Table Partitioning

All of those problems aside, it gets even worse with the table partitioning coming in the upcoming PostgreSQL 10 release:

Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed

So you want inheritance? Forgo partitioning, which actually has real planner advantages.
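For reference, a minimal sketch of declarative partitioning as introduced in PostgreSQL 10. The table and column names are invented for the example.

```sql
-- Requires PostgreSQL 10 or later.
CREATE TEMP TABLE measurement (
   logdate date NOT NULL
 , reading int
) PARTITION BY RANGE (logdate);

CREATE TEMP TABLE measurement_y2017 PARTITION OF measurement
   FOR VALUES FROM ('2017-01-01') TO ('2018-01-01');

INSERT INTO measurement VALUES ('2017-06-01', 42);  -- routed to measurement_y2017

-- Expected to be rejected: a partitioned table cannot be an inheritance parent.
-- CREATE TEMP TABLE measurement_extra (note text) INHERITS (measurement);
```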

ALTER TABLE

Alas, ALTER TABLE has quite a few drawbacks listed in its notes as well:

If a table has any descendant tables, it is not permitted to add, rename, or change the type of a column, or rename an inherited constraint in the parent table without doing the same to the descendants. That is, ALTER TABLE ONLY will be rejected. This ensures that the descendants always have columns matching the parent. [...] A recursive DROP COLUMN operation will remove a descendant table's column only if the descendant does not inherit that column from any other parents and never had an independent definition of the column. A nonrecursive DROP COLUMN (i.e., ALTER TABLE ONLY ... DROP COLUMN) never removes any descendant columns, but instead marks them as independently defined rather than inherited. [...] The TRIGGER, CLUSTER, OWNER, and TABLESPACE actions never recurse to descendant tables; that is, they always act as though ONLY were specified. Adding a constraint recurses only for CHECK constraints that are not marked NO INHERIT.
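Two of those rules, sketched with throwaway temp tables (the names inh_p and inh_c are invented for the example):

```sql
CREATE TEMP TABLE inh_p (id int);
CREATE TEMP TABLE inh_c () INHERITS (inh_p);

-- Expected to be rejected: a column cannot be added to the parent alone.
-- ALTER TABLE ONLY inh_p ADD COLUMN note text;

ALTER TABLE inh_p ADD COLUMN note text;        -- recurses: inh_c gets "note" too
ALTER TABLE ONLY inh_p DROP COLUMN note;       -- inh_p loses "note" ...
SELECT * FROM inh_c;                           -- ... inh_c keeps it as its own column
```

After the nonrecursive DROP COLUMN, parent and child no longer have matching columns, which is exactly the kind of drift the notes warn about.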

Conclusion

I don't think many people use inheritance. I've never seen it in the wild. Inheritance in the db adds to the learning curve and some features are just better left alone. You don't have to find an application for them.

You may find this post on Stack Overflow useful, "When to use inherited tables in PostgreSQL?".