You can always implement your own table serving as "materialized view". That's how we did it before MATERIALIZED VIEW
was implemented in Postgres 9.3.
You can create a plain VIEW
:
CREATE VIEW graph_avg_view AS
SELECT xaxis, AVG(value) AS avg_val
FROM graph
GROUP BY xaxis;
And materialize the result once or whenever you need to start over:
CREATE TABLE graph_avg AS
SELECT * FROM graph_avg_view;
(Or use the SELECT
statement directly, without creating a VIEW
.)
Then, depending on undisclosed details of your use case, you can DELETE
/ UPDATE
/ INSERT
changes manually.
A basic DML statement with data-modifying CTEs for your table as is:
Assuming nobody else tries to write to graph_avg
concurrently (reading is no problem):
WITH del AS (
DELETE FROM graph_avg t
WHERE NOT EXISTS (SELECT FROM graph_avg_view WHERE xaxis = t.xaxis)
)
, upd AS (
UPDATE graph_avg t
SET avg_val = v.avg_val
FROM graph_avg_view v
WHERE t.xaxis = v.xaxis
AND t.avg_val <> v.avg_val
-- AND t.avg_val IS DISTINCT FROM v.avg_val -- alt if avg_val can be NULL
)
INSERT INTO graph_avg t -- no target list, whole row
SELECT v.*
FROM graph_avg_view v
WHERE NOT EXISTS (SELECT FROM graph_avg WHERE xaxis = v.xaxis);
Basic recipe
- Add a
timestamp
column with default now()
to your base table. Let's call it ts
.
- If you have updates, add a trigger to set the current timestamp with every update that changes either
xaxis
or value
.
Create a tiny table to remember the timestamp of your latest snapshot. Let's call it mv
:
CREATE TABLE mv (
tbl text PRIMARY KEY
, ts timestamp NOT NULL DEFAULT '-infinity'
); -- possibly more details
Create this partial, multicolumn index:
CREATE INDEX graph_mv_latest ON graph (xaxis, value)
WHERE ts >= '-infinity';
Use the timestamp of the last snapshot as predicate in your queries to refresh the snapshot with perfect index usage.
At the end of the transaction, drop the index and recreate it with the transaction timestamp replacing the timestamp in the index predicate (initially '-infinity'
), which you also save to your table. Everything in one transaction.
Note that the partial index is great to cover INSERT
and UPDATE
operations, but not DELETE
. To cover that, you need to consider the entire table. It all depends on exact requirements.
As a_horse_with_no_name said in a comment:
No, that's not possible. You need some kind of scheduler that runs refresh materialized view e.g. pg_cron or something on the operating system level – a_horse_with_no_name
Alternatively, if you need a MATERIALIZED VIEW
that refreshes when you run SELECT
, just remove MATERIALIZED
and use a regular VIEW
. Materialization only adds a periodic cache. It's only needed when the query itself is prohibitively slow or hot.
Best Answer
Table statistics are not updated automatically.
They remain unchanged if refreshed with the
CONCURRENTLY
option.Without
CONCURRENTLY
, a new file is written for the table, and the basic countsrelpages
andreltuples
in the system catalogpg_class
are consequently reset to0
. Table statistics (per column) inpg_statistic
still remain unchanged, though.I ran a quick test in Postgres 9.6 to confirm.
In some cases (where query plans don't depend on current statistics much) you may want to save the time and not run
ANALYZE
at all. In all other cases you'll want / need to schedule a manualANALYZE myview;
right after theREFRESH
. Or you wait until autovacuum kicks in, which only happens if enough rows were changed (CONCURRENTLY
actually runs DDL commands on the MV to only affect changed rows.)