I have the following tables (taken from the Sakila database):
- film: film_id is pkey
- actor: actor_id is pkey
- film_actor: film_id and actor_id are fkeys to film/actor
I am selecting a particular film. For this film, I also want all actors participating in that film. I have two queries for this: one with a LEFT JOIN and one with a LEFT JOIN LATERAL.
select film.film_id, film.title, a.actors
from film
left join
(
    select film_actor.film_id, array_agg(first_name) as actors
    from actor
    inner join film_actor using(actor_id)
    group by film_actor.film_id
) as a
on a.film_id = film.film_id
where film.title = 'ACADEMY DINOSAUR'
order by film.title;
select film.film_id, film.title, a.actors
from film
left join lateral
(
    select array_agg(first_name) as actors
    from actor
    inner join film_actor using(actor_id)
    where film_actor.film_id = film.film_id
) as a
on true
where film.title = 'ACADEMY DINOSAUR'
order by film.title;
When comparing the query plans, the first query performs much worse (about 20x slower) than the second:
Merge Left Join  (cost=507.20..573.11 rows=1 width=51) (actual time=15.087..15.089 rows=1 loops=1)
  Merge Cond: (film.film_id = film_actor.film_id)
  ->  Sort  (cost=8.30..8.31 rows=1 width=19) (actual time=0.075..0.075 rows=1 loops=1)
        Sort Key: film.film_id
        Sort Method: quicksort  Memory: 25kB
        ->  Index Scan using idx_title on film  (cost=0.28..8.29 rows=1 width=19) (actual time=0.044..0.058 rows=1 loops=1)
              Index Cond: ((title)::text = 'ACADEMY DINOSAUR'::text)
  ->  GroupAggregate  (cost=498.90..552.33 rows=997 width=34) (actual time=15.004..15.004 rows=1 loops=1)
        Group Key: film_actor.film_id
        ->  Sort  (cost=498.90..512.55 rows=5462 width=8) (actual time=14.934..14.937 rows=11 loops=1)
              Sort Key: film_actor.film_id
              Sort Method: quicksort  Memory: 449kB
              ->  Hash Join  (cost=6.50..159.84 rows=5462 width=8) (actual time=0.355..8.359 rows=5462 loops=1)
                    Hash Cond: (film_actor.actor_id = actor.actor_id)
                    ->  Seq Scan on film_actor  (cost=0.00..84.62 rows=5462 width=4) (actual time=0.035..2.205 rows=5462 loops=1)
                    ->  Hash  (cost=4.00..4.00 rows=200 width=10) (actual time=0.303..0.303 rows=200 loops=1)
                          Buckets: 1024  Batches: 1  Memory Usage: 17kB
                          ->  Seq Scan on actor  (cost=0.00..4.00 rows=200 width=10) (actual time=0.027..0.143 rows=200 loops=1)
Planning time: 1.495 ms
Execution time: 15.426 ms
Nested Loop Left Join  (cost=25.11..33.16 rows=1 width=51) (actual time=0.849..0.854 rows=1 loops=1)
  ->  Index Scan using idx_title on film  (cost=0.28..8.29 rows=1 width=19) (actual time=0.045..0.048 rows=1 loops=1)
        Index Cond: ((title)::text = 'ACADEMY DINOSAUR'::text)
  ->  Aggregate  (cost=24.84..24.85 rows=1 width=32) (actual time=0.797..0.797 rows=1 loops=1)
        ->  Hash Join  (cost=10.82..24.82 rows=5 width=6) (actual time=0.672..0.764 rows=10 loops=1)
              Hash Cond: (film_actor.actor_id = actor.actor_id)
              ->  Bitmap Heap Scan on film_actor  (cost=4.32..18.26 rows=5 width=2) (actual time=0.072..0.150 rows=10 loops=1)
                    Recheck Cond: (film_id = film.film_id)
                    Heap Blocks: exact=10
                    ->  Bitmap Index Scan on idx_fk_film_id  (cost=0.00..4.32 rows=5 width=0) (actual time=0.041..0.041 rows=10 loops=1)
                          Index Cond: (film_id = film.film_id)
              ->  Hash  (cost=4.00..4.00 rows=200 width=10) (actual time=0.561..0.561 rows=200 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 17kB
                    ->  Seq Scan on actor  (cost=0.00..4.00 rows=200 width=10) (actual time=0.039..0.275 rows=200 loops=1)
Planning time: 1.722 ms
Execution time: 1.087 ms
Why is this? I want to learn to reason about this, so I can understand what is going on and can predict how the query will behave when data size increases and which decisions the planner will make under certain conditions.
My thoughts: in the first LEFT JOIN query, it looks like the subquery is executed for all films in the database, without taking into account the filter in the outer query that restricts it to one particular film. Why is the planner not able to use that knowledge in the subquery?
In the LEFT JOIN LATERAL query, we are more or less 'pushing' that filter downwards. So the issue we had in the first query is not present here, hence the better performance.
I guess I am mainly looking for rules of thumb, general wisdom, … so this planner magic becomes second nature – if that makes sense.
update (1)
Rewriting the LEFT JOIN as follows also gives better performance (slightly better than the LEFT JOIN LATERAL):
select film.film_id, film.title, array_agg(a.first_name) as actors
from film
left join
(
    select film_actor.film_id, actor.first_name
    from actor
    inner join film_actor using(actor_id)
) as a
on a.film_id = film.film_id
where film.title = 'ACADEMY DINOSAUR'
group by film.film_id
order by film.title;
GroupAggregate  (cost=29.44..29.49 rows=1 width=51) (actual time=0.470..0.471 rows=1 loops=1)
  Group Key: film.film_id
  ->  Sort  (cost=29.44..29.45 rows=5 width=25) (actual time=0.428..0.430 rows=10 loops=1)
        Sort Key: film.film_id
        Sort Method: quicksort  Memory: 25kB
        ->  Nested Loop Left Join  (cost=4.74..29.38 rows=5 width=25) (actual time=0.149..0.386 rows=10 loops=1)
              ->  Index Scan using idx_title on film  (cost=0.28..8.29 rows=1 width=19) (actual time=0.056..0.057 rows=1 loops=1)
                    Index Cond: ((title)::text = 'ACADEMY DINOSAUR'::text)
              ->  Nested Loop  (cost=4.47..19.09 rows=200 width=8) (actual time=0.087..0.316 rows=10 loops=1)
                    ->  Bitmap Heap Scan on film_actor  (cost=4.32..18.26 rows=5 width=4) (actual time=0.052..0.089 rows=10 loops=1)
                          Recheck Cond: (film_id = film.film_id)
                          Heap Blocks: exact=10
                          ->  Bitmap Index Scan on idx_fk_film_id  (cost=0.00..4.32 rows=5 width=0) (actual time=0.035..0.035 rows=10 loops=1)
                                Index Cond: (film_id = film.film_id)
                    ->  Index Scan using actor_pkey on actor  (cost=0.14..0.17 rows=1 width=10) (actual time=0.011..0.011 rows=1 loops=10)
                          Index Cond: (actor_id = film_actor.actor_id)
Planning time: 1.833 ms
Execution time: 0.706 ms
How can we reason about this?
update (2)
I continued with some experiments and I think an interesting rule of thumb is: apply the aggregate function as high/late as possible. The query in update (1) probably performs better because we are aggregating in the outer query, not in the inner query.
The same seems to apply if we rewrite the LEFT JOIN LATERAL above as follows:
select film.film_id, film.title, array_agg(a.first_name) as actors
from film
left join lateral
(
    select actor.first_name
    from actor
    inner join film_actor using(actor_id)
    where film_actor.film_id = film.film_id
) as a
on true
where film.title = 'ACADEMY DINOSAUR'
group by film.film_id
order by film.title;
GroupAggregate  (cost=29.44..29.49 rows=1 width=51) (actual time=0.088..0.088 rows=1 loops=1)
  Group Key: film.film_id
  ->  Sort  (cost=29.44..29.45 rows=5 width=25) (actual time=0.076..0.077 rows=10 loops=1)
        Sort Key: film.film_id
        Sort Method: quicksort  Memory: 25kB
        ->  Nested Loop Left Join  (cost=4.74..29.38 rows=5 width=25) (actual time=0.031..0.066 rows=10 loops=1)
              ->  Index Scan using idx_title on film  (cost=0.28..8.29 rows=1 width=19) (actual time=0.010..0.010 rows=1 loops=1)
                    Index Cond: ((title)::text = 'ACADEMY DINOSAUR'::text)
              ->  Nested Loop  (cost=4.47..19.09 rows=200 width=8) (actual time=0.019..0.052 rows=10 loops=1)
                    ->  Bitmap Heap Scan on film_actor  (cost=4.32..18.26 rows=5 width=4) (actual time=0.013..0.024 rows=10 loops=1)
                          Recheck Cond: (film_id = film.film_id)
                          Heap Blocks: exact=10
                          ->  Bitmap Index Scan on idx_fk_film_id  (cost=0.00..4.32 rows=5 width=0) (actual time=0.007..0.007 rows=10 loops=1)
                                Index Cond: (film_id = film.film_id)
                    ->  Index Scan using actor_pkey on actor  (cost=0.14..0.17 rows=1 width=10) (actual time=0.002..0.002 rows=1 loops=10)
                          Index Cond: (actor_id = film_actor.actor_id)
Planning time: 0.440 ms
Execution time: 0.136 ms
Here, we moved array_agg() upwards. As you can see, this plan is also better than the original LEFT JOIN LATERAL.
That said, I am not sure whether this self-invented rule of thumb (apply the aggregate function as high/late as possible) holds in other cases.
additional information
Fiddle: https://dbfiddle.uk/?rdbms=postgres_10&fiddle=4ec4f2fffd969d9e4b949bb2ca765ffb
Version: PostgreSQL 10.4 on x86_64-pc-linux-musl, compiled by gcc (Alpine 6.4.0) 6.4.0, 64-bit
Environment: Docker: docker run -e POSTGRES_PASSWORD=sakila -p 5432:5432 -d frantiseks/postgres-sakila. Please note that the image on Docker Hub is outdated, so I did a local build first with docker build -t frantiseks/postgres-sakila after cloning the git repository.
Table definitions:
film
film_id | integer | not null default nextval('film_film_id_seq'::regclass)
title | character varying(255) | not null
Indexes:
"film_pkey" PRIMARY KEY, btree (film_id)
"idx_title" btree (title)
Referenced by:
TABLE "film_actor" CONSTRAINT "film_actor_film_id_fkey" FOREIGN KEY (film_id) REFERENCES film(film_id) ON UPDATE CASCADE ON DELETE RESTRICT
actor
actor_id | integer | not null default nextval('actor_actor_id_seq'::regclass)
first_name | character varying(45) | not null
Indexes:
"actor_pkey" PRIMARY KEY, btree (actor_id)
Referenced by:
TABLE "film_actor" CONSTRAINT "film_actor_actor_id_fkey" FOREIGN KEY (actor_id) REFERENCES actor(actor_id) ON UPDATE CASCADE ON DELETE RESTRICT
film_actor
actor_id | smallint | not null
film_id | smallint | not null
Indexes:
"film_actor_pkey" PRIMARY KEY, btree (actor_id, film_id)
"idx_fk_film_id" btree (film_id)
Foreign-key constraints:
"film_actor_actor_id_fkey" FOREIGN KEY (actor_id) REFERENCES actor(actor_id) ON UPDATE CASCADE ON DELETE RESTRICT
"film_actor_film_id_fkey" FOREIGN KEY (film_id) REFERENCES film(film_id) ON UPDATE CASCADE ON DELETE RESTRICT
Data: this is from the Sakila sample database. This question is not a real-life case; I am using this database mostly as a learning sample. I was introduced to SQL some months ago and I am trying to expand my knowledge. The data has the following distribution:
select count(*) from film: 1000
select count(*) from actor: 200
select avg(a) from (select film_id, count(actor_id) a from film_actor group by film_id) a: 5.47
Best Answer
Test setup
Your original setup in the fiddle leaves room for improvement. I kept asking for your setup for a reason.
You have these indexes on film_actor:

"film_actor_pkey" PRIMARY KEY, btree (actor_id, film_id)
"idx_fk_film_id" btree (film_id)

Which is pretty helpful already. But to best support your particular query, you would have a multicolumn index on (film_id, actor_id), columns in this order. A practical solution: replace idx_fk_film_id with an index on (film_id, actor_id), or create the PK on (film_id, actor_id) for the purpose of this test, like I do below.

In a read-only table (or mostly read-only, or generally when VACUUM can keep up with write activity), it also helps to have an index on (title, film_id) to allow index-only scans. My test case is now highly optimized for read performance.

Type mismatch between film.film_id (integer) and film_actor.film_id (smallint): while that works, it makes queries slower and can lead to various complications. It also makes FK constraints more expensive. Never do this if it can be avoided. If you are not sure, pick integer over smallint: while smallint can save 2 bytes per field (often consumed by alignment padding), there are more complications than with integer.

To optimize the performance of the test itself, create indexes and constraints after bulk-inserting lots of rows. It is substantially slower to add tuples incrementally to existing indexes than to create them from scratch with all rows present.
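A sketch of those index changes in SQL, using the question's schema. The exact setup is in the linked fiddle; the index name idx_film_title_film_id is illustrative:

```sql
-- recreate the PK with film_id leading and drop the then-redundant
-- single-column index (the multicolumn-index recommendation above)
alter table film_actor drop constraint film_actor_pkey;
drop index idx_fk_film_id;
alter table film_actor add primary key (film_id, actor_id);

-- (title, film_id) enables index-only scans for the title lookup
create index idx_film_title_film_id on film (title, film_id);
```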
Unrelated to this test:
Free-standing sequences plus column defaults instead of the much simpler and more reliable serial (or IDENTITY) columns. Don't.

timestamp without time zone is typically unreliable for a column like last_update. Use timestamptz instead. And note that column defaults do not cover the "last update", strictly speaking.

The length modifier in character varying(255) indicates that the test case is not intended for Postgres to begin with, because the odd length limit is pretty pointless here. (Or the author is clueless.)

Consider the audited test case in the fiddle:
db<>fiddle here - building on your fiddle, optimized and with added queries.
A test setup with 1000 films and 200 actors has limited validity. The most efficient queries take < 0.2 ms. Planning time is more than execution time. A test with 100k or more rows would be more revealing.
Why retrieve only first names of actors? Once you retrieve multiple columns, you already have a slightly different situation.
ORDER BY title makes no sense while filtering for a single title with WHERE title = 'ACADEMY DINOSAUR'. Maybe ORDER BY film_id?

And for total runtime, rather use EXPLAIN (ANALYZE, TIMING OFF) to reduce (potentially misleading) noise from sub-timing overhead.
Answer
It's hard to form a simple rule of thumb, because total performance depends on many factors. Very basic guidelines:
Aggregating all rows in sub-tables carries less overhead, but only pays off when you actually need all rows (or a very large part).
For selecting few rows (your test!), different query techniques yield better results. That's where LATERAL comes in: it carries more overhead, but only reads the required rows from sub-tables. A big win if only a (very) small fraction is needed.

For your particular test case, I would also test an ARRAY constructor in the LATERAL subquery: when aggregating only a single array in the lateral subquery, a simple ARRAY constructor performs better than the aggregate function array_agg().
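Such an ARRAY-constructor variant might look like this; a sketch against the question's schema, the answer's exact query is in the fiddle:

```sql
select f.film_id, f.title, a.actors
from film f
left join lateral (
    -- ARRAY(subquery) builds the array without an aggregate function
    select array(
        select act.first_name
        from film_actor fa
        join actor act using (actor_id)
        where fa.film_id = f.film_id
    ) as actors
) a on true
where f.title = 'ACADEMY DINOSAUR';
```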
Or with a lowly correlated subquery for the simple case.
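The correlated-subquery version could be sketched as follows, again assuming the question's schema rather than the fiddle's exact query:

```sql
-- the ARRAY constructor sits directly in the SELECT list,
-- correlated on f.film_id
select f.film_id, f.title,
       array(select act.first_name
             from film_actor fa
             join actor act using (actor_id)
             where fa.film_id = f.film_id) as actors
from film f
where f.title = 'ACADEMY DINOSAUR';
```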
Or, very basically, just 2x LEFT JOIN and then aggregate.

These three seem fastest in my updated fiddle (planning + execution time).
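The "2x LEFT JOIN, then aggregate" variant might be sketched as:

```sql
-- join through film_actor to actor, then aggregate in the outer query;
-- grouping by the PK film_id lets us select title as well
select f.film_id, f.title, array_agg(act.first_name) as actors
from film f
left join film_actor fa using (film_id)
left join actor act using (actor_id)
where f.title = 'ACADEMY DINOSAUR'
group by f.film_id;
```

Note that with this variant a film without any actors yields an array containing a single NULL, rather than an empty array.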
Your first attempt (only slightly modified) is typically fastest when retrieving all or most films, but not for a small selection.
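The exact "slight modifications" are only shown in the fiddle; as a hypothetical sketch, the question's first query without the single-title filter illustrates the retrieve-all-films case:

```sql
select f.film_id, f.title, a.actors
from film f
left join (
    -- pre-aggregate actors per film once, for the whole table
    select fa.film_id, array_agg(act.first_name) as actors
    from film_actor fa
    join actor act using (actor_id)
    group by fa.film_id
) a using (film_id)
order by f.title;   -- no WHERE clause: we want all (or most) films
```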
Tests with much bigger cardinalities will be more revealing. And don't generalise results lightly; there are many factors affecting total performance.