Here's a different approach for which recomputing the calculated_price is just an optimization, as opposed to being strictly necessary.
Suppose that in the currencies table you add another column, last_rate, which contains the exchange rate at the time calculated_price was last updated, whenever that happened.
To quickly retrieve a set of products with a price point between, say, 50 USD and 100 USD that includes the desired results, you could do something like this:
SELECT * FROM products
WHERE calculated_price > 50.0 / (:last_rate *
      (SELECT coalesce(max(value/last_rate), 1) FROM currencies
       WHERE value > last_rate))
AND calculated_price < 100.0 / (:last_rate *
      (SELECT coalesce(min(value/last_rate), 1) FROM currencies
       WHERE value < last_rate))
where :last_rate contains the EUR/USD exchange rate at the time of the last update. The idea is to widen the interval to take into account the maximum variation of every currency since then. The widening factors for both ends of the interval are constant between rate updates, so they could be pre-computed.
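For example, both factors could be fetched in a single pass whenever the rates change (a sketch; the column aliases are made up, and the FILTER clause requires PostgreSQL 9.4+):
SELECT coalesce(max(value / last_rate) FILTER (WHERE value > last_rate), 1) AS low_factor   -- widens the lower bound
     , coalesce(min(value / last_rate) FILTER (WHERE value < last_rate), 1) AS high_factor  -- widens the upper bound
FROM currencies;
The application can then plug low_factor and high_factor into the range query in place of the two subqueries.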
Since the rates change only slightly over short periods of time, the above query is likely to give a close approximation of the final result. To get the final result, let's filter out the products for which the prices have slipped out of the bounds due to the changes in rates since the last update of calculated_price:
WITH p AS (
   SELECT * FROM products
   WHERE calculated_price > 50.0 / (:last_rate *
         (SELECT coalesce(max(value/last_rate), 1) FROM currencies
          WHERE value > last_rate))
   AND calculated_price < 100.0 / (:last_rate *
         (SELECT coalesce(min(value/last_rate), 1) FROM currencies
          WHERE value < last_rate))
)
SELECT price, c.value FROM p
JOIN currencies c ON (p.currency = c.id)
WHERE price / c.value > 50 / :current_rate
AND price / c.value < 100 / :current_rate;
where :current_rate is the up-to-date exchange rate between EUR and the currency chosen by the user.
The efficiency comes from the assumption that the range of rate variations is small, so the pre-filtered set is already close to the final result.
You can always implement your own table serving as a "materialized view". That's how we did it before MATERIALIZED VIEW was implemented in Postgres 9.3.
You can create a plain VIEW:
CREATE VIEW graph_avg_view AS
SELECT xaxis, AVG(value) AS avg_val
FROM graph
GROUP BY xaxis;
And materialize the result once or whenever you need to start over:
CREATE TABLE graph_avg AS
SELECT * FROM graph_avg_view;
(Or use the SELECT statement directly, without creating a VIEW.)
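To start over later without dropping the table, a truncate-and-refill in a single transaction does the same job (sketch):
BEGIN;
TRUNCATE graph_avg;
INSERT INTO graph_avg SELECT * FROM graph_avg_view;
COMMIT;
TRUNCATE takes an exclusive lock, so concurrent readers block until the transaction commits.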
Then, depending on undisclosed details of your use case, you can DELETE / UPDATE / INSERT changes manually.
A basic DML statement with data-modifying CTEs for your table as is, assuming nobody else tries to write to graph_avg concurrently (reading is no problem):
WITH del AS (
   DELETE FROM graph_avg t
   WHERE NOT EXISTS (SELECT FROM graph_avg_view WHERE xaxis = t.xaxis)
   )
, upd AS (
   UPDATE graph_avg t
   SET avg_val = v.avg_val
   FROM graph_avg_view v
   WHERE t.xaxis = v.xaxis
   AND t.avg_val <> v.avg_val
   -- AND t.avg_val IS DISTINCT FROM v.avg_val -- alternative if avg_val can be NULL
   )
INSERT INTO graph_avg -- no target list: inserting whole rows
SELECT v.*
FROM graph_avg_view v
WHERE NOT EXISTS (SELECT FROM graph_avg WHERE xaxis = v.xaxis);
Basic recipe
- Add a timestamp column with default now() to your base table. Let's call it ts.
- If you have updates, add a trigger to set the current timestamp with every update that changes either xaxis or value (see the sketch after this list).
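Such a trigger could look like this (function and trigger names are made up):
CREATE OR REPLACE FUNCTION graph_touch()
  RETURNS trigger
  LANGUAGE plpgsql AS
$func$
BEGIN
   NEW.ts := now();  -- stamp the row with the time of the change
   RETURN NEW;
END
$func$;

CREATE TRIGGER graph_touch_upd
BEFORE UPDATE ON graph
FOR EACH ROW
WHEN (OLD.xaxis IS DISTINCT FROM NEW.xaxis
   OR OLD.value IS DISTINCT FROM NEW.value)
EXECUTE PROCEDURE graph_touch();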
Create a tiny table to remember the timestamp of your latest snapshot. Let's call it mv:
CREATE TABLE mv (
tbl text PRIMARY KEY
, ts timestamp NOT NULL DEFAULT '-infinity'
); -- possibly more details
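Seed it with one row per snapshot table; the default of '-infinity' makes the first refresh pick up everything:
INSERT INTO mv (tbl) VALUES ('graph');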
Create this partial, multicolumn index:
CREATE INDEX graph_mv_latest ON graph (xaxis, value)
WHERE ts >= '-infinity';
Use the timestamp of the last snapshot as predicate in your queries to refresh the snapshot with perfect index usage.
At the end of the transaction, drop the index and recreate it with the transaction timestamp replacing the timestamp in the index predicate (initially '-infinity'), which you also save to your table. Everything in one transaction.
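A sketch of one refresh cycle; :last_ts stands for the saved snapshot timestamp, passed in as a literal so the planner can match it against the partial index predicate, and the final timestamp literal is only an example:
BEGIN;

WITH ch AS (  -- groups touched since the last snapshot; uses graph_mv_latest
   SELECT DISTINCT xaxis
   FROM graph
   WHERE ts >= :last_ts
   )
, agg AS (    -- recompute the aggregate for those groups only
   SELECT g.xaxis, avg(g.value) AS avg_val
   FROM ch
   JOIN graph g USING (xaxis)
   GROUP BY g.xaxis
   )
, upd AS (    -- update groups that already have a row
   UPDATE graph_avg t
   SET avg_val = a.avg_val
   FROM agg a
   WHERE t.xaxis = a.xaxis
   )
INSERT INTO graph_avg (xaxis, avg_val)  -- insert brand-new groups
SELECT a.xaxis, a.avg_val
FROM agg a
WHERE NOT EXISTS (SELECT FROM graph_avg WHERE xaxis = a.xaxis);

UPDATE mv SET ts = transaction_timestamp() WHERE tbl = 'graph';

DROP INDEX graph_mv_latest;
CREATE INDEX graph_mv_latest ON graph (xaxis, value)
WHERE ts >= '2016-01-01 03:00';  -- the transaction timestamp, as a literal

COMMIT;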
Note that the partial index is great to cover INSERT and UPDATE operations, but not DELETE. To cover deletes, you need to consider the entire table; a periodic sweep like the one below can do it. It all depends on exact requirements.
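Such a sweep can reuse the NOT EXISTS anti-join from the DML statement above (sketch):
DELETE FROM graph_avg t
WHERE NOT EXISTS (SELECT FROM graph WHERE xaxis = t.xaxis);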
Best Answer
You want to move the computation from the column to the other side of the inequality operators, where you have literals rather than a column:
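The original query isn't reproduced here; the pattern looks like this, with table and predicate assumed for illustration:
-- before: the expression wraps the column, so an index on birthday can't be used
SELECT * FROM people WHERE age(birthday) >= interval '18 years';

-- after: birthday stands alone, compared against a value computed once
SELECT * FROM people WHERE birthday <= now() - interval '18 years';
(The two forms are not exactly equivalent around month boundaries, which is part of the fiddling mentioned below.)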
Then a simple index on "birthday" will make it efficient. It might need some fiddling depending on what datatype "birthday" is. I tested it as a timestamp without time zone.