PostgreSQL – Increasing join performance on a very large table

Tags: join; partitioning; postgresql

I have the following query in a Postgres database:

SELECT
    (rel."external-institute" || '#' || rel."external-institute-code")::varchar(50)
FROM public."td_documents" d
JOIN request_docs rd
    ON d."type-id" = rd."type-id"
    AND d."series-md5" = rd."series"
    AND d."number-md5" = rd."number"
    AND d."issue-date-string-md5" = rd."issue-date"
JOIN public."tl_relationships" tlr
    ON tlr."entity-id" = d."id"
    AND tlr."entity" IN (50, 51, 59)
JOIN public."td_relationships" rel
    ON rel."id" = tlr."relationship-id"
    AND rel."relationship-type" = 30

The table request_docs contains the series and numbers of the documents to find and is essentially an input parameter. It contains between 100 and 1,000 rows.
The other tables contain millions of rows each, and the tl_relationships table has about 1.7 billion rows.
Table td_documents has the following indexes:

CREATE UNIQUE INDEX "UNQ_td_documents_number-md5_series-md5_type-id" ON public.td_documents USING btree ("number-md5", "series-md5", "type-id", "issue-date-string-md5")

CREATE UNIQUE INDEX pk_td_documents ON public.td_documents USING btree (id)

Table tl_relationships has indexes:

CREATE INDEX "IDX_tl_relationships-entity-entity-id" ON public.tl_relationships USING btree (entity, "entity-id")
CREATE INDEX "IDX_tl_relationships_relationship-id" ON public.tl_relationships USING btree ("relationship-id")
CREATE UNIQUE INDEX pk_tl_relationships ON public.tl_relationships USING btree (id)

Table td_relationships has indexes:

CREATE INDEX "IDX_td_relationships_external-institute-code" ON public.td_relationships USING btree ("external-institute", "external-institute-code")
CREATE INDEX "IDX_td_relationships_id" ON public.td_relationships USING btree (id)
CREATE UNIQUE INDEX idx_td_relationships_and_type ON public.td_relationships USING btree (id, "relationship-type")
CREATE UNIQUE INDEX pk_td_relationships ON public.td_relationships USING btree (id)

The execution of this query takes between 1 and 120 seconds.


An example execution plan in text form (with the BUFFERS option):

Nested Loop  (cost=1.84..5812.59 rows=1 width=118) (actual time=53.335..63870.863 rows=334 loops=1)
Buffers: shared hit=18934 read=7096, local hit=5
I/O Timings: read=63669.907
->  Nested Loop  (cost=1.27..5778.91 rows=5 width=4) (actual time=27.046..38184.671 rows=3536 loops=1)
    Buffers: shared hit=7292 read=4251, local hit=5
    I/O Timings: read=38081.637
    ->  Nested Loop  (cost=0.69..5508.33 rows=1 width=4) (actual time=11.807..10750.304 rows=431 loops=1)
          Buffers: shared hit=1991 read=825, local hit=5
          I/O Timings: read=10725.009
          ->  Seq Scan on request_docs rd  (cost=0.00..16.30 rows=630 width=100) (actual time=0.009..1.189 rows=476 loops=1)
                Buffers: local hit=5
          ->  Index Scan using "UNQ_td_documents_number-md5_series-md5_type-id" on td_documents d  (cost=0.69..8.72 rows=1 width=59) (actual time=22.571..22.571 rows=1 loops=476)
                Index Cond: (("number-md5" = rd.number) AND ("series-md5" = rd.series) AND ("type-id" = rd."type-id") AND ("issue-date-string-md5" = rd."issue-date"))
                Buffers: shared hit=1991 read=825
                I/O Timings: read=10725.009
    ->  Index Scan using "IDX_tl_relationships-entity-entity-id" on tl_relationships tlr  (cost=0.58..269.79 rows=80 width=8) (actual time=15.067..63.623 rows=8 loops=431)
          Index Cond: ((entity = ANY ('{50,51,59}'::integer[])) AND ("entity-id" = d.id))
          Buffers: shared hit=5301 read=3426
          I/O Timings: read=27356.627
->  Index Scan using idx_td_relationships_and_type on td_relationships rel  (cost=0.57..6.73 rows=1 width=17) (actual time=7.258..7.258 rows=0 loops=3536)
    Index Cond: ((id = tlr."relationship-id") AND ("relationship-type" = 30))
    Buffers: shared hit=11642 read=2845
    I/O Timings: read=25588.270
Planning Time: 0.497 ms
Execution Time: 63871.338 ms

The question is: is it possible to decrease the execution time for larger inputs without normalizing or partitioning any of the tables?

Thank you in advance for any advice!

Upd: Added execution plan in image form.

Upd: Added execution plan in text form (with the BUFFERS option).

Best Answer

Your I/O performance here seems pretty horrible. Is this a RAID made up of laptop hard drives? Any possibility you could just get better hardware, or enough RAM to cache large parts of the data? How much RAM do you have, and how big are the tables and indexes?

->  Index Scan using "UNQ_td_documents_number-md5_series-md5_type-id" on td_documents d  (cost=0.69..8.72 rows=1 width=59) (actual time=22.571..22.571 rows=1 loops=476)

While this is not the worst absolute time sink, 22 ms per row seems particularly egregious. One thing you could do is add an index on public.td_documents USING btree ("number-md5", "series-md5", "type-id", "issue-date-string-md5", id) to replace the index "UNQ_td_documents_number-md5_series-md5_type-id". That way the scan can get "id" directly from the index without having to visit the table, halving the number of disk reads.
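A sketch of that covering index (the index name is illustrative). The answer proposes appending id to the key columns; on PostgreSQL 11 or later an INCLUDE clause gets the same index-only scan while preserving the UNIQUE constraint on the original four columns:

-- Covering variant of the lookup index: "id" travels in the index
-- entries, so the scan can be index-only and skip the heap.
-- CONCURRENTLY avoids blocking writes while the index builds.
CREATE UNIQUE INDEX CONCURRENTLY "UNQ_td_documents_lookup_covering"
    ON public.td_documents
    USING btree ("number-md5", "series-md5", "type-id", "issue-date-string-md5")
    INCLUDE (id);

-- Once the new index is built and valid, drop the one it replaces:
DROP INDEX CONCURRENTLY "UNQ_td_documents_number-md5_series-md5_type-id";

Note that an index-only scan also needs a reasonably fresh visibility map, so make sure td_documents is vacuumed regularly.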

->  Index Scan using idx_td_relationships_and_type on td_relationships rel  (cost=0.57..6.73 rows=1 width=17) (actual time=7.258..7.258 rows=0 loops=3536)
    Index Cond: ((id = tlr."relationship-id") AND ("relationship-type" = 30))
    Buffers: shared hit=11642 read=2845
    I/O Timings: read=25588.270

Reversing the order of the columns in that index might help here. Since "relationship-type" is always tested against the same value, putting it first means that part of the index will be denser in rows of interest to you, leading to better caching. And if you add "external-institute" and "external-institute-code" to the end of the index, you could get another index-only scan. Is "relationship-type" always tested against 30, or does that value change from run to run? If it is always 30, then rather than leading the index with that column, you can remove it from the key and add WHERE "relationship-type" = 30 to make a partial index.
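A sketch of both variants (index names are illustrative, and the partial index assumes the constant really is always 30):

-- Variant 1: lead with the constant column, then the join key, and
-- cover the two output columns so the probe is an index-only scan.
CREATE INDEX CONCURRENTLY "IDX_td_relationships_type_id_covering"
    ON public.td_relationships
    USING btree ("relationship-type", id, "external-institute", "external-institute-code");

-- Variant 2: if the type is always 30, index only those rows; the
-- index is smaller and its pages cache better.
CREATE INDEX CONCURRENTLY "IDX_td_relationships_id_type30"
    ON public.td_relationships
    USING btree (id, "external-institute", "external-institute-code")
    WHERE "relationship-type" = 30;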

Perhaps you want it to do hash joins rather than nested loops, but at 1.7 billion rows you will probably not want to full-scan tl_relationships either. How long does a simple SELECT count(*) FROM td_relationships take?
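If you want to see what the hash-join plan looks like without committing to anything, you can discourage nested loops for a single session (a diagnostic experiment, not a production setting):

-- Session-local experiment: with nested loops discouraged, the
-- planner will usually show its hash-join alternative for comparison.
SET enable_nestloop = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT
    (rel."external-institute" || '#' || rel."external-institute-code")::varchar(50)
FROM public."td_documents" d
JOIN request_docs rd
    ON d."type-id" = rd."type-id"
    AND d."series-md5" = rd."series"
    AND d."number-md5" = rd."number"
    AND d."issue-date-string-md5" = rd."issue-date"
JOIN public."tl_relationships" tlr
    ON tlr."entity-id" = d."id"
    AND tlr."entity" IN (50, 51, 59)
JOIN public."td_relationships" rel
    ON rel."id" = tlr."relationship-id"
    AND rel."relationship-type" = 30;
RESET enable_nestloop;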

Another avenue is to just start increasing random_page_cost until the plan changes, then see what the new plan is and whether it is better or worse. You will have to do something each time, like dropping the caches or changing the parameters, to prevent caching from spoiling the test.
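For example, per session (the default random_page_cost is 4.0; the step values here are arbitrary):

SHOW random_page_cost;       -- check the current value first
SET random_page_cost = 8;    -- session-local; then try 16, 32, ...
-- re-run EXPLAIN (ANALYZE, BUFFERS) on the query after each step
-- and stop once the plan shape changes, then compare runtimes
RESET random_page_cost;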