Apparently, you want to see constraint exclusion doing its work. Since in such setups the parent table has no constraint (and, just to mention it, usually no rows), it is always visited by queries on the parent table. Using your example schema, see the output:
test=# EXPLAIN SELECT count(1) FROM p;
QUERY PLAN
───────────────────────────────────────────────────────────────────
Aggregate  (cost=576.00..576.01 rows=1 width=0)
  ->  Append  (cost=0.00..501.00 rows=30001 width=0)
        ->  Seq Scan on p  (cost=0.00..0.00 rows=1 width=0)
        ->  Seq Scan on a  (cost=0.00..164.00 rows=10000 width=0)
        ->  Seq Scan on b  (cost=0.00..164.00 rows=10000 width=0)
        ->  Seq Scan on c  (cost=0.00..173.00 rows=10000 width=0)
test=# EXPLAIN SELECT count(1) FROM p WHERE type_id = 1;
QUERY PLAN
───────────────────────────────────────────────────────────────────
Aggregate  (cost=412.00..412.01 rows=1 width=0)
  ->  Append  (cost=0.00..387.00 rows=10002 width=0)
        ->  Seq Scan on p  (cost=0.00..0.00 rows=1 width=0)
              Filter: (type_id = 1)
        ->  Seq Scan on a  (cost=0.00..189.00 rows=10000 width=0)
              Filter: (type_id = 1)
        ->  Seq Scan on c  (cost=0.00..198.00 rows=1 width=0)
              Filter: (type_id = 1)
In the second case, table b is not visited, because the planner knows for sure that it cannot contain any rows matching the condition. At the same time, p is present in both cases. The same applies to c, too: the planner has no easy way (a CHECK constraint) to exclude either table from the plan, so it has to visit every table that could possibly contain matching rows.
Let's add a check constraint to c, too:
ALTER TABLE c ADD CHECK (type_id = 3);
test=# EXPLAIN SELECT count(1) FROM p WHERE type_id = 1;
QUERY PLAN
───────────────────────────────────────────────────────────────────
Aggregate  (cost=214.00..214.01 rows=1 width=0)
  ->  Append  (cost=0.00..189.00 rows=10001 width=0)
        ->  Seq Scan on p  (cost=0.00..0.00 rows=1 width=0)
              Filter: (type_id = 1)
        ->  Seq Scan on a  (cost=0.00..189.00 rows=10000 width=0)
              Filter: (type_id = 1)
Much better now.
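For reference, a minimal schema that reproduces these plans might look like this (the table and column names follow the example above; the exact column list and the way the CHECK constraints on a and b were created are assumptions):

```sql
CREATE TABLE p (
    id      integer,
    type_id integer
);

-- children "partitioned" by type_id via inheritance
CREATE TABLE a (CHECK (type_id = 1)) INHERITS (p);
CREATE TABLE b (CHECK (type_id = 2)) INHERITS (p);
CREATE TABLE c () INHERITS (p);  -- no CHECK yet, as in the first plans
```

With this setup, queries against p expand to an Append over p and all children, and constraint exclusion can prune only the children whose CHECK contradicts the WHERE clause.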
Now let's have a look at the locks that are taken.
BEGIN; -- to be able to see the locks
SELECT count(1) FROM p WHERE type_id = 1;
SELECT relname, locktype, mode
FROM pg_locks
JOIN pg_class c ON relation = c.oid
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE nspname = 'dba'
AND relkind = 'r';
relname │ locktype │ mode
─────────┼──────────┼─────────────────
c │ relation │ AccessShareLock
b │ relation │ AccessShareLock
a │ relation │ AccessShareLock
p │ relation │ AccessShareLock
This is not so nice... Apparently (and somewhat surprisingly), going to the parent table locks all children, too.
But:
ROLLBACK;
BEGIN; -- to be able to see the locks
SELECT count(1) FROM a WHERE type_id = 1; -- touching only "a"
-- the locks taken:
a │ relation │ AccessShareLock
This means that if you can constrain your query to touch only a subset of the tables, you save the locks on the other children. A bit of experimentation shows that this applies to other statements (for example, an UPDATE), too; only the lock modes differ.
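A quick sketch of that experiment (the SET clause is just a no-op example; the dba schema is the same as in the lock query above):

```sql
BEGIN;
UPDATE a SET type_id = type_id WHERE type_id = 1;  -- touches only "a"

SELECT relname, locktype, mode
FROM pg_locks
JOIN pg_class c ON relation = c.oid
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE nspname = 'dba'
  AND relkind = 'r';

-- relname │ locktype │ mode
-- a       │ relation │ RowExclusiveLock

ROLLBACK;
```

An UPDATE takes RowExclusiveLock on the tables it modifies instead of the AccessShareLock a plain SELECT takes, but the pruning behaviour is the same.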
In short: no.
One of the statistics PostgreSQL collects is the correlation between the values of a column (actually the ranks of those values) and the physical location of the rows in the table. You can find that result in pg_stats.correlation for your specific tablename and attname.
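Looking it up is a simple query against the pg_stats view (the schema, table, and column names here are placeholders):

```sql
SELECT tablename, attname, correlation
FROM pg_stats
WHERE schemaname = 'public'
  AND tablename  = 'your_table'
  AND attname    = 'your_column';
```

A value near 1 or -1 means the column's values are stored in nearly sorted order on disk; a value near 0 means they are scattered randomly.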
However, the particular query you give is not one of the situations where this information is used. Its primary use is to estimate how much of the I/O done for the table-lookup step of a large index range scan (e.g. WHERE state = 'CA', or WHERE age BETWEEN 50 AND 65) will be sequential-like I/O (either truly sequential, close enough to sequential that read-ahead still triggers, or falling into the same table block as the previous row and so requiring no I/O at all) rather than random I/O.
Also, since your relationship is not monotonic but rather shaped like a parabola, the estimated correlation will understate the strength of the association in your case, as it captures only linear correlation.
Best Answer
For the discussion, let's assume your execution plan is a sequential scan with a filter. I also assume that you are using the default READ COMMITTED isolation level. Then PostgreSQL will read through the table sequentially.
Whenever it finds a row that matches the filter, that row will be locked and updated.
If locking a row is blocked by a concurrent query, PostgreSQL waits until the lock goes away. Then it re-evaluates the filter condition and either moves on (if the condition no longer applies on account of a concurrent modification) or it locks and updates the modified row.
See the documentation:
In particular, it is possible that two UPDATE statements that each modify several rows deadlock with each other, since they acquire locks as they proceed and locks are always held until the end of the transaction.
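A minimal sketch of how two such UPDATEs can deadlock (the table, column, and id values are hypothetical):

```sql
-- Session 1                           -- Session 2
BEGIN;                                 BEGIN;
UPDATE t SET x = 0 WHERE id = 1;
                                       UPDATE t SET x = 0 WHERE id = 2;
UPDATE t SET x = 0 WHERE id = 2;       -- session 1 now blocks on session 2's row lock
                                       UPDATE t SET x = 0 WHERE id = 1;
                                       -- session 2 blocks on session 1: deadlock
```

PostgreSQL's deadlock detector will abort one of the two transactions with a deadlock error after deadlock_timeout; the other then proceeds normally.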