PostgreSQL – Automatically Update View Based on Daily Created Tables

database-designpartitioningpostgresql

Let's say I have a table that is created daily:

20181227
20181226
20181225
...

There is view that is like:

CREATE OR REPLACE VIEW someview
AS 
SELECT * FROM 20181227
UNION
SELECT * FROM 20181226
UNION
SELECT * FROM 20181225
...

Is there a way to replace that with another code that automatically grabs data from all those tables without having to make a union for each one? Right now there is a bash script that runs daily and regenerates the view with a new union daily, but that's inelegant.

Best Answer

Look to table partitioning. Ideally, use the latest version of Postgres (currently Postgres 11) since there have been major improvements in Postgres 10 and 11.

Postgres 11 allows RANGE, LIST and HASH partitioning. One big partition per day would suggest LIST partitioning based on a date column like:

CREATE TABLE foo (
  foo_date date NOT NULL
, foo_id   bigint NOT NULL GENERATED ALWAYS AS IDENTITY
, data     text
, PRIMARY KEY (foo_date, foo_id)
)
PARTITION BY LIST (foo_date);

CREATE TABLE foo_20181226 PARTITION OF foo FOR VALUES IN ('20181226');
CREATE TABLE foo_20181227 PARTITION OF foo FOR VALUES IN ('20181227');
-- etc.

(Alternatively, you might have a timestamp or timestamptz column and use RANGE partitioning for that.)

Then you can query the master table directly to automatically include all partitions:

SELECT * FROM foo;

Or (since your main concern seems to be short syntax):

TABLE foo;

See:

Various optimizations are possible, with constraints, indexes, column defaults etc. depending on requirement details.

Be sure to read the linked chapter of the manual to understand various pros and cons. Yours should be the perfect use case (unless undisclosed requirements are in the way). In particular, you can easily and very quickly add and remove partitions with minimum interference with the rest of the table.

There are still limitations in the current implementation. In particular, partition pruning is great to improve performance, but there is room for improvement (currently in development). That said, it's probably going to nuke the performance of your view for queries that don't need to involve all tables (partitions), since the view is going to consider all union'ed tables every time. A potentially faster alternative for certain queries is (currently) picking relevant tables (partitions) in a custom UNION ALL query.

And don't use names consisting of only digits like in your example 20181225. Use legal, lower-case names starting with a letter, or you have to double quote the identifier at all times.

SQL identifiers and key words must begin with a letter (a-z, but also letters with diacritical marks and non-Latin letters) or an underscore (_). Subsequent characters in an identifier or key word can be letters, underscores, digits (0-9), or dollar signs ($). Note that dollar signs are not allowed in identifiers according to the letter of the SQL standard, so their use might render applications less portable. The SQL standard will not define a key word that contains digits or starts or ends with an underscore, so identifiers of this form are safe against possible conflict with future extensions of the standard.

Related: