So I have tables in two different databases that I need to analyze together. I am using fdw to do so. However I am running in to weird behavior with my joins.
Keep in mind the following snippets are just accessing the foreign database.
When I do:
SELECT *
FROM tds tr
INNER JOIN dq_infos dq ON tr.id = dq.id
WHERE tr.created_time >= '2020-04-22'
it runs very fast, however the result of this join will have duplicated columns so I can't really do anything with it.
When instead do:
SELECT tr.id
FROM tds tr
INNER JOIN dq_infos dq ON tr.id = dq.id
WHERE tr.created_time >= '2020-04-22'
it takes roughly 10 times as long.
I thought this can't be right so I looked at the query plan and it makes even less sense.
The query plan for the first query is:
Foreign Scan (cost=100.00..177.23 rows=52 width=908) (actual time=398.276..30161.096 rows=449408 loops=1)
Output: *
Relations: (public.tds tr) INNER JOIN (public.dq_infos dq)
Remote SQL: SELECT * FROM (public.tds r1 INNER JOIN public.dq_infos r2 ON (((r1.id = r2.id)) AND ((r1.created_time >= '2020-04-22 00:00:00'::timestamp without time zone))))
Planning Time: 0.605 ms
Execution Time: 30180.511 ms
(I replaced the full column list with * in this execution plan to make it easier to read)
And the query plan for the second query is:
Merge Join (cost=580.61..799.37 rows=14259 width=4) (actual time=195048.232..195469.693 rows=449617 loops=1)
Output: tr.id
Merge Cond: (tr.id = dq.id)
Buffers: shared hit=4, temp read=176695 written=178369
-> Sort (cost=214.47..216.91 rows=975 width=4) (actual time=4195.650..4355.824 rows=449618 loops=1)
Output: tr.id
Sort Key: tr.id
Sort Method: external merge Disk: 6552kB
Buffers: shared hit=4, temp read=2283 written=2375
-> Foreign Scan on public.tds tr (cost=100.00..166.06 rows=975 width=4) (actual time=412.259..3815.536 rows=475864 loops=1)
Output: tr.id
Remote SQL: SELECT id FROM public.tds WHERE ((created_time >= '2020-04-22 00:00:00'::timestamp without time zone))
-> Sort (cost=366.15..373.46 rows=2925 width=4) (actual time=188117.092..190081.599 rows=12577570 loops=1)
Output: dq.id
Sort Key: dq.id
Sort Method: external sort Disk: 221568kB
Buffers: temp read=174412 written=175994
-> Foreign Scan on public.dq_infos dq (cost=100.00..197.75 rows=2925 width=4) (actual time=1.041..33499.615 rows=12577570 loops=1)
Output: dq.id
Remote SQL: SELECT id FROM public.dq_infos
Planning Time: 0.311 ms
Execution Time: 195648.893 ms
This is clearly the reason for it being slower though I can't figure out why it would do this.
Best Answer
You never
ANALYZE
d the foreign tables, so the estimates are way off, and PostgreSQL chooses a bad execution plan.should improve the situation.
Different from regular tables, foreign tables are not handled by autoanalyze.