PostgreSQL – Troubleshooting Foreign Data Wrappers Join Behavior

execution-planjoin;optimizationpostgresqlpostgresql-fdw

So I have tables in two different databases that I need to analyze together. I am using fdw to do so. However I am running in to weird behavior with my joins.

Keep in mind the following snippets are just accessing the foreign database.

When I do:

SELECT *
FROM tds tr
INNER JOIN dq_infos dq ON tr.id = dq.id
WHERE tr.created_time >= '2020-04-22'

it runs very fast, however the result of this join will have duplicated columns so I can't really do anything with it.

When instead do:

SELECT tr.id
FROM tds tr
INNER JOIN dq_infos dq ON tr.id = dq.id
WHERE tr.created_time >= '2020-04-22'

it takes roughly 10 times as long.

I thought this can't be right so I looked at the query plan and it makes even less sense.

The query plan for the first query is:

Foreign Scan  (cost=100.00..177.23 rows=52 width=908) (actual time=398.276..30161.096 rows=449408 loops=1)
  Output: *
  Relations: (public.tds tr) INNER JOIN (public.dq_infos dq)
  Remote SQL: SELECT * FROM (public.tds r1 INNER JOIN public.dq_infos r2 ON (((r1.id = r2.id)) AND ((r1.created_time >= '2020-04-22 00:00:00'::timestamp without time zone))))
Planning Time: 0.605 ms
Execution Time: 30180.511 ms

(I replaced the full column list with * in this execution plan to make it easier to read)

And the query plan for the second query is:

Merge Join  (cost=580.61..799.37 rows=14259 width=4) (actual time=195048.232..195469.693 rows=449617 loops=1)
  Output: tr.id
  Merge Cond: (tr.id = dq.id)
  Buffers: shared hit=4, temp read=176695 written=178369
  ->  Sort  (cost=214.47..216.91 rows=975 width=4) (actual time=4195.650..4355.824 rows=449618 loops=1)
    Output: tr.id
    Sort Key: tr.id
    Sort Method: external merge  Disk: 6552kB
    Buffers: shared hit=4, temp read=2283 written=2375
    ->  Foreign Scan on public.tds tr  (cost=100.00..166.06 rows=975 width=4) (actual time=412.259..3815.536 rows=475864 loops=1)
      Output: tr.id
      Remote SQL: SELECT id FROM public.tds WHERE ((created_time >= '2020-04-22 00:00:00'::timestamp without time zone))
  ->  Sort  (cost=366.15..373.46 rows=2925 width=4) (actual time=188117.092..190081.599 rows=12577570 loops=1)
    Output: dq.id
    Sort Key: dq.id
    Sort Method: external sort  Disk: 221568kB
    Buffers: temp read=174412 written=175994
    ->  Foreign Scan on public.dq_infos dq  (cost=100.00..197.75 rows=2925 width=4) (actual time=1.041..33499.615 rows=12577570 loops=1)
      Output: dq.id
      Remote SQL: SELECT id FROM public.dq_infos
Planning Time: 0.311 ms
Execution Time: 195648.893 ms

This is clearly the reason for it being slower though I can't figure out why it would do this.

Best Answer

You never ANALYZEd the foreign tables, so the estimates are way off, and PostgreSQL chooses a bad execution plan.

ANALYZE tds, dq_infos;

should improve the situation.

Different from regular tables, foreign tables are not handled by autoanalyze.