Postgresql – Optimize a slow running Postgres SQL query

performancepostgresqlpostgresql-performance

I have a SQL optimization question.

Raw SQL

SELECT releases.id AS release_id,
releases.name AS release_name,
releases.label_id AS release_label_id,
releases.provider_id AS release_provider_id,
tracks.vendor_id AS track_vendor_id,
tracks.id AS track_id, artists.display_name AS artist_name,
artists.id AS artist_id
FROM "artists"
INNER JOIN artist_relations ON artist_relations.artist_id = artists.id AND artist_relations.artist_item_type = 'Release'
INNER JOIN releases ON releases.id = artist_relations.artist_item_id
INNER JOIN release_tracks ON release_tracks.release_id = releases.id
INNER JOIN  tracks ON tracks.id = release_tracks.track_id
WHERE "tracks"."vendor" IN ($1 ... $1000) AND (tracks.provider_id LIKE 'vendor:xxxx:%')

Using postgres EXPLAIN command, i have following metrics

Nested Loop  (cost=2.15..13582.98 rows=840 width=103)
->  Nested Loop  (cost=1.72..13171.34 rows=840 width=89)
Join Filter: (releases.id = artist_relations.artist_item_id)
->  Nested Loop  (cost=1.29..12735.96 rows=762 width=89)
->  Nested Loop  (cost=0.86..12332.99 rows=762 width=23)
->  Index Scan using index_tracks_on_vendor_id on tracks  (cost=0.43..6757.77 rows=633 width=15)
Index Cond: ((vendor_id)::text = ANY ('{1207836,3860107,4595748,8319459 ....... }'::text[]))
Filter: ((provider_id)::text ~~ 'vendor:itunes:%'::text)
->  Index Scan using idx_36553_index_release_tracks_on_track_id on release_tracks  (cost=0.43..8.78 rows=3 width=16)
Index Cond: (track_id = tracks.id)
->  Index Scan using idx_36527_primary on releases  (cost=0.43..0.53 rows=1 width=66)
Index Cond: (id = release_tracks.release_id)
->  Index Scan using index_artist_relations_on_artist_item_type_and_artist_item_id on artist_relations  (cost=0.43..0.56 rows=1 width=16)
Index Cond: (((artist_item_type)::text = 'Release'::text) AND (artist_item_id = release_tracks.release_id))
->  Index Scan using idx_35952_primary on artists  (cost=0.43..0.49 rows=1 width=22)
Index Cond: (id = artist_relations.artist_id)
(16 rows)

OUTPUT of EXPLAIN (ANALYZE,BUFFERS)

[
  ["Nested Loop  (cost=2.15..13582.98 rows=840 width=103) (actual time=22.672..25343.907 rows=1796 loops=1)"],
  ["  Buffers: shared hit=19574 read=3721"],
  ["  ->  Nested Loop  (cost=1.72..13171.34 rows=840 width=89) (actual time=12.598..23709.124 rows=1796 loops=1)"],
  ["        Join Filter: (releases.id = artist_relations.artist_item_id)"],
  ["        Buffers: shared hit=12787 read=3323"],
  ["        ->  Nested Loop  (cost=1.29..12735.96 rows=762 width=89) (actual time=4.201..16740.676 rows=999 loops=1)"],
  ["              Buffers: shared hit=9392 read=2342"],
  ["              ->  Nested Loop  (cost=0.86..12332.99 rows=762 width=23) (actual time=2.128..11460.000 rows=999 loops=1)"],
  ["                    Buffers: shared hit=6136 read=1599"],
  ["                    ->  Index Scan using index_tracks_on_vendor_id on tracks  (cost=0.43..6757.77 rows=633 width=15) (actual time=2.115..5288.257 rows=999 loops=1)"],
  ["                          Index Cond: ((vendor_id)::text = ANY ('{1207836...3860107}'::text[]))"],
  ["                          Filter: ((provider_id)::text ~~ 'vendor:itunes:%'::text)"],
  ["                          Buffers: shared hit=2934 read=799"],
  ["                    ->  Index Scan using idx_36553_index_release_tracks_on_track_id on release_tracks  (cost=0.43..8.78 rows=3 width=16) (actual time=6.164..6.175 rows=1 loops=999)"],
  ["                          Index Cond: (track_id = tracks.id)"],
  ["                          Buffers: shared hit=3202 read=800"],
  ["              ->  Index Scan using idx_36527_primary on releases  (cost=0.43..0.53 rows=1 width=66) (actual time=5.283..5.283 rows=1 loops=999)"],
  ["                    Index Cond: (id = release_tracks.release_id)"],
  ["                    Buffers: shared hit=3256 read=743"],
  ["        ->  Index Scan using index_artist_relations_on_artist_item_type_and_artist_item_id on artist_relations  (cost=0.43..0.56 rows=1 width=16) (actual time=6.436..6.969 rows=2 loops=999)"],
  ["              Index Cond: (((artist_item_type)::text = 'Release'::text) AND (artist_item_id = release_tracks.release_id))"],
  ["              Buffers: shared hit=3395 read=981"],
  ["  ->  Index Scan using idx_35952_primary on artists  (cost=0.43..0.49 rows=1 width=22) (actual time=0.908..0.908 rows=1 loops=1796)"],
  ["        Index Cond: (id = artist_relations.artist_id)"],
  ["        Buffers: shared hit=6787 read=398"],
  ["Planning time: 2.248 ms"],
  ["Execution time: 25344.613 ms"]
]

Questions

  1. How can I change my query to avoid Nested Loop?
  2. Index Scan using index_tracks_on_vendor_id on tracks (cost=0.43..6757.77 rows=633 width=15) seems rather huge. Any suggestions to bring it down?

Best Answer

If the condition tracks.provider_id LIKE 'vendor:xxxx:%' is selective, you could consider adding an index for it:

CREATE INDEX ON tracks ((provider_id::text) text_pattern_ops);

But that will at best buy you a few seconds — the other joins together contribute most of the execution time.

I don't think avoiding a nested loop join will improve execution time, but you can try to disable it temporarily with

SET enable_nestloop = off;

to see if you actually gain anything that way.

There is probably no much you can do except having more RAM so that all the tables are cached.

Another idea is to include the columns in the SELECT list at the end of the respective indexes in the hope you get a faster index only scan.