PostgreSQL: Slow JOIN between multiple tables.

performance, postgresql, query-performance

Context:

I am working on a data warehouse that needs responsive lookups, as it is used as the source for an operator console. The target is sub-1-second lookups (ideally 100 ms).

The size of the data is expected to grow into the millions of rows, but this can be limited since older data is less relevant.

The tables:

Tables either contain data as an append-only log, or act as an identifier of which records are the "latest" for each resource.

First, some information on the relevant tables:

activity: 3.5 million rows, 34 columns (append-only log)

activitylatest: 1.1 million rows, 4 columns (identifier for activity)

round: 3.5 million rows, 17 columns (append-only log)

roundlatest: 0.7 million rows, 4 columns (identifier for round)

Indexes are available on all columns (except for large text columns), since users may filter on any field, and update_id is the primary key of the data tables.
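To illustrate, the layout looks roughly like this (the column types below are placeholders and most columns are omitted; only the names that appear in the query and plan are real):

create table activity (
    update_id    bigint primary key,  -- append-only log key
    activity_id  bigint,
    community_id bigint,
    round_id     bigint
    -- ... roughly 30 further columns, some of them large text
);

create table activitylatest (
    update_id    bigint               -- marks the latest activity row per resource
    -- ... 3 further columns
);

-- round / roundlatest follow the same pattern. Every filterable column
-- also has its own single-column index, e.g.:
create index on activity (community_id);
create index on activity (round_id);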

The Query

I'm trying to get the data from activity based on the user's filtering, which may include a single column from the round table.

select
    activity.update_id,
    ...
    activity.activity_id,
    round.round_external_ref
from activity
join round
    on activity.round_id = round.round_id
join roundlatest
    on round.update_id = roundlatest.update_id
join activitylatest
    on activity.update_id = activitylatest.update_id
where activity.community_id = 1
order by activity.update_id desc
limit 1001

Executing this query yields the following plan (EXPLAIN ANALYZE):

Limit  (cost=1.71..12367.37 rows=1001 width=2056) (actual time=2324.231..2341.835 rows=1001 loops=1)
  ->  Nested Loop  (cost=1.71..14219657.65 rows=1151081 width=2056) (actual time=2324.231..2341.721 rows=1001 loops=1)
        ->  Nested Loop  (cost=1.29..6270077.60 rows=5932256 width=2064) (actual time=2324.204..2333.877 rows=5158 loops=1)
              ->  Nested Loop  (cost=0.86..3654699.78 rows=1187917 width=2019) (actual time=0.024..1765.586 rows=390452 loops=1)
                    ->  Index Only Scan Backward using "IDX_datawarehouse_gameActivitylatest_update_id" on datawarehouse_gameactivitylatest  (cost=0.43..60604.50 rows=1199068 width=8) (actual time=0.015..165.819 rows=390452 loops=1)
                          Heap Fetches: 239183
                    ->  Index Scan using "PK_gameActivity_update_id" on datawarehouse_gameactivity  (cost=0.43..2.99 rows=1 width=2019) (actual time=0.002..0.003 rows=1 loops=390452)
                          Index Cond: (update_id = datawarehouse_gameactivitylatest.update_id)
                          Filter: (community_id = 1)
              ->  Index Scan using "IDX_datawarehouse_gameRound_round_id" on datawarehouse_gameround  (cost=0.43..2.15 rows=5 width=53) (actual time=0.001..0.001 rows=0 loops=390452)
                    Index Cond: (round_id = datawarehouse_gameactivity.round_id)
        ->  Index Only Scan using "IDX_datawarehouse_gameRoundlatest_update_id" on datawarehouse_gameroundlatest  (cost=0.42..1.33 rows=1 width=8) (actual time=0.001..0.001 rows=0 loops=5158)
              Index Cond: (update_id = datawarehouse_gameround.update_id)
              Heap Fetches: 1067
Planning time: 1.153 ms
Execution time: 2341.989 ms

The first run of the query takes over 30 seconds (sometimes up to 60 seconds). Successive runs settle at roughly 5 seconds, which I assume is due to system caching. I need to get both the first and successive runs below 1 second.

During my testing I noticed that simply switching the ordering from DESC to ASC improves the query to roughly 2.5 seconds, with a merge join being used. This has left me puzzled; I'm not sure of the cause, and I do need DESC.
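For clarity, the only change in the faster variant is the sort direction at the end of the query; everything else stays the same:

order by activity.update_id asc
limit 1001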

ASC EXPLAIN ANALYZE:

Limit  (cost=184.88..10004.46 rows=1001 width=2056) (actual time=0.064..23.341 rows=1001 loops=1)
  ->  Nested Loop  (cost=184.88..11301988.03 rows=1152097 width=2056) (actual time=0.063..23.265 rows=1001 loops=1)
        ->  Nested Loop  (cost=184.46..3332810.27 rows=5940024 width=2064) (actual time=0.054..14.535 rows=5227 loops=1)
              ->  Merge Join  (cost=184.03..710946.78 rows=1189471 width=2019) (actual time=0.046..7.915 rows=1001 loops=1)
                    Merge Cond: (datawarehouse_gameactivity.update_id = datawarehouse_gameactivitylatest.update_id)
                    ->  Index Scan using "PK_gameActivity_update_id" on datawarehouse_gameactivity  (cost=0.43..630047.01 rows=3522681 width=2019) (actual time=0.010..5.513 rows=3007 loops=1)
                          Filter: (community_id = 1)
                    ->  Index Only Scan using "IDX_datawarehouse_gameActivitylatest_update_id" on datawarehouse_gameactivitylatest  (cost=0.43..60662.53 rows=1200637 width=8) (actual time=0.007..0.204 rows=1001 loops=1)
                          Heap Fetches: 0
              ->  Index Scan using "IDX_datawarehouse_gameRound_round_id" on datawarehouse_gameround  (cost=0.43..2.15 rows=5 width=53) (actual time=0.002..0.005 rows=5 loops=1001)
                    Index Cond: (round_id = datawarehouse_gameactivity.round_id)
        ->  Index Only Scan using "IDX_datawarehouse_gameRoundlatest_update_id" on datawarehouse_gameroundlatest  (cost=0.42..1.33 rows=1 width=8) (actual time=0.001..0.001 rows=0 loops=5227)
              Index Cond: (update_id = datawarehouse_gameround.update_id)
              Heap Fetches: 88
Planning time: 1.124 ms
Execution time: 23.475 ms

(EXPLAIN ANALYZE says 23 ms, but actually running the query takes about 2.5 seconds according to pgAdmin.)

We are using PostgreSQL 9.6 on SSD-backed cloud VMs.

Best Answer

A composite index might be optimal here:

create index on activity (community_id, update_id);

The reason the direction of the ORDER BY matters (probably) is that rows with community_id = 1 preferentially occur near the low end of the update_id values, so walking up that index until you accumulate 1001 rows with community_id = 1 is much faster than walking down it until you accumulate 1001 such rows. The composite index gets rid of that problem: you can jump straight to the rows you want rather than having to filter them out later.
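A b-tree index can be scanned in either direction, so the same (community_id, update_id) index serves both the ASC and DESC variants. Re-running the query under EXPLAIN (ANALYZE, BUFFERS) is a quick way to confirm the planner picks it up (the column list below is shortened for readability):

-- with the composite index above in place:
explain (analyze, buffers)
select
    activity.update_id,
    activity.activity_id,
    round.round_external_ref
from activity
join round
    on activity.round_id = round.round_id
join roundlatest
    on round.update_id = roundlatest.update_id
join activitylatest
    on activity.update_id = activitylatest.update_id
where activity.community_id = 1
order by activity.update_id desc
limit 1001;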

Regarding your doubts about how useful the single-column indexes are: the planner will pick the single-column index it thinks is most selective, so it is better than nothing, but not as good as the perfect multi-column index. It can also combine several single-column indexes using bitmap scans; again, not as good as the perfect multi-column index, but it could easily be good enough. If you always ORDER BY update_id, you can productively turn every single-column index into a two-column composite index ending in update_id, which should not add all that much overhead since the number of indexes stays the same. Unfortunately you can't get both the bitmap scan and the index-supported ORDER BY at the same time, so the planner will have to choose between them.
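As a sketch of that last suggestion (round_id and activity_id here are just example filter columns taken from the query; adjust the list to whatever filters you actually expose), each single-column index becomes a two-column index ending in update_id:

-- in place of: create index on activity (round_id);
create index on activity (round_id, update_id);

-- in place of: create index on activity (activity_id);
create index on activity (activity_id, update_id);

-- ... and so on for each filterable column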