(Apologies for answering my own question, but after reading this unrelated question and answer, it occurred to me I should try using a CTE instead. It works.)
Here is another view, similar to testview1
in the question, but that uses a Common Table Expression:
CREATE VIEW testview2 AS
WITH testcte AS (SELECT
t1.id,
t1.log_timestamp,
(SELECT SUM(t2.col1) FROM test2 t2 WHERE t2.test1_id=t1.id AND category='A') AS a,
(SELECT SUM(t2.col2) FROM test2 t2 WHERE t2.test1_id=t1.id AND category='B') AS b,
(SELECT SUM(t2.col1 - t2.col2) FROM test2 t2 WHERE t2.test1_id=t1.id AND category='C') AS c
FROM test1 t1)
SELECT * FROM testcte;
(This is just an example, I'm not suggesting that combining a view and a CTE are a necessarily a good idea: a CTE might be enough.)
Unlike testview1
, the query plan for SELECT a FROM testview2
now also computes b
and c
, which were ignored since unused in testview1
:
Subquery Scan testview2 (cost=395272.42..395535.25 rows=8761 width=8) (actual time=0.256..607.941 rows=8761 loops=1)
-> CTE Scan on testcte (cost=395272.42..395447.64 rows=8761 width=36) (actual time=0.255..604.106 rows=8761 loops=1)
CTE testcte
-> Seq Scan on test1 t1 (cost=0.00..395272.42 rows=8761 width=12) (actual time=0.252..589.358 rows=8761 loops=1)
SubPlan 1
-> Aggregate (cost=15.02..15.03 rows=1 width=4) (actual time=0.021..0.021 rows=1 loops=8761)
-> Bitmap Heap Scan on test2 t2 (cost=4.28..15.02 rows=1 width=4) (actual time=0.015..0.015 rows=0 loops=8761)
Recheck Cond: (test1_id = $0)
Filter: ((category)::text = 'A'::text)
-> Bitmap Index Scan on test_if_idx (cost=0.00..4.28 rows=3 width=0) (actual time=0.009..0.009 rows=3 loops=8761)
Index Cond: (test1_id = $0)
SubPlan 2
-> Aggregate (cost=15.02..15.03 rows=1 width=4) (actual time=0.019..0.019 rows=1 loops=8761)
-> Bitmap Heap Scan on test2 t2 (cost=4.28..15.02 rows=1 width=4) (actual time=0.012..0.012 rows=0 loops=8761)
Recheck Cond: (test1_id = $0)
Filter: ((category)::text = 'B'::text)
-> Bitmap Index Scan on test_if_idx (cost=0.00..4.28 rows=3 width=0) (actual time=0.007..0.007 rows=3 loops=8761)
Index Cond: (test1_id = $0)
SubPlan 3
-> Aggregate (cost=15.02..15.04 rows=1 width=8) (actual time=0.020..0.020 rows=1 loops=8761)
-> Bitmap Heap Scan on test2 t2 (cost=4.28..15.02 rows=1 width=8) (actual time=0.013..0.014 rows=0 loops=8761)
Recheck Cond: (test1_id = $0)
Filter: ((category)::text = 'C'::text)
-> Bitmap Index Scan on test_if_idx (cost=0.00..4.28 rows=3 width=0) (actual time=0.007..0.007 rows=3 loops=8761)
Index Cond: (test1_id = $0)
However, it doesn't re-compute the results that are used multiple times within the same query (which was the objective).
Unlike testview1
with which SELECT a, a, a, a, a
took 5 times longer than SELECT a
, here SELECT a, a, a, a, a, b, c, a+b, a+c, b+c FROM testview2
takes just as long as SELECT a FROM testview2
or SELECT a, b, c FROM testview2
. It only goes through a
, b
and c
once:
Subquery Scan testview2 (cost=395272.42..395600.96 rows=8761 width=24) (actual time=0.147..562.790 rows=8761 loops=1)
-> CTE Scan on testcte (cost=395272.42..395447.64 rows=8761 width=36) (actual time=0.144..554.194 rows=8761 loops=1)
CTE testcte
-> Seq Scan on test1 t1 (cost=0.00..395272.42 rows=8761 width=12) (actual time=0.140..542.657 rows=8761 loops=1)
SubPlan 1
-> Aggregate (cost=15.02..15.03 rows=1 width=4) (actual time=0.019..0.019 rows=1 loops=8761)
-> Bitmap Heap Scan on test2 t2 (cost=4.28..15.02 rows=1 width=4) (actual time=0.012..0.013 rows=0 loops=8761)
Recheck Cond: (test1_id = $0)
Filter: ((category)::text = 'A'::text)
-> Bitmap Index Scan on test_if_idx (cost=0.00..4.28 rows=3 width=0) (actual time=0.007..0.007 rows=3 loops=8761)
Index Cond: (test1_id = $0)
SubPlan 2
-> Aggregate (cost=15.02..15.03 rows=1 width=4) (actual time=0.019..0.019 rows=1 loops=8761)
-> Bitmap Heap Scan on test2 t2 (cost=4.28..15.02 rows=1 width=4) (actual time=0.012..0.012 rows=0 loops=8761)
Recheck Cond: (test1_id = $0)
Filter: ((category)::text = 'B'::text)
-> Bitmap Index Scan on test_if_idx (cost=0.00..4.28 rows=3 width=0) (actual time=0.006..0.006 rows=3 loops=8761)
Index Cond: (test1_id = $0)
SubPlan 3
-> Aggregate (cost=15.02..15.04 rows=1 width=8) (actual time=0.018..0.019 rows=1 loops=8761)
-> Bitmap Heap Scan on test2 t2 (cost=4.28..15.02 rows=1 width=8) (actual time=0.012..0.012 rows=0 loops=8761)
Recheck Cond: (test1_id = $0)
Filter: ((category)::text = 'C'::text)
-> Bitmap Index Scan on test_if_idx (cost=0.00..4.28 rows=3 width=0) (actual time=0.007..0.007 rows=3 loops=8761)
Index Cond: (test1_id = $0)
Here is my reading of that section. I will admit it is confusing.
Suppose I have two tables:
CREATE TABLE batch (
id serial not null unique,
control_code text primary key,
date_posted date not null default now()
);
CREATE TABLE details (
batch_id int not null references batch(id),
description text,
primary key(batch_id, description)
);
Now, suppose we insert batch and details records in different transactions. Session 1 inserts a batch and starts to insert details but before it finishes, session 2 starts up. Session 2 gets to see the batch heading info, but does not wait on the commit on details to inform the user that no records are found. Now if your batch and details are entirely in the same transaction then this never is a problem.
this would differ from serializable where you would expect to wait for the previous insert to complete and commit or rollback before determining whether to notify the user that no rows were found.
Best Answer
Basically,
NULL
is causing this, becauseNULL<>NULL
. One of the columns in your self-joined table will be allNULL
s.Here's a little test case that shows why this can happen. Naughty
NULL
equality and the wayNATURAL JOIN
works, picking column names to join on for you.Setup:
Queries:
The first query will give 2, the second query 1.
Your problem will just be a more advanced version of this.
Use
JOIN .... USING
with the columns of your own choice to workaround this.