Postgresql – Postgres JOIN strange behaviour

indexjoin;performancepostgresqlpostgresql-performance

I'm just starting studying Postgres and I'm in a situation where, depending how I do the JOIN on my tables, the performance and plan output seems really strange.

Those are the table used with their indexes:

create table escola
(
    pk_codigo integer not null
        constraint pk_escola
            primary key,
    nome varchar(100),
    municipio varchar(150),
    uf char(2),
    cod_municipio integer,
    uf_id integer default 0 not null
        constraint fk_escola_uf_id
            references tb_uf
)
;

create index idx_escola_uf
    on escola (uf)
;

create index idx_escola_uf_id
    on escola (uf_id)
;

create index idx_multi_escola_uf_pk
    on escola (uf, pk_codigo)
;

create table if not exists candidato
(
    pk_numero_inscricao bigint not null
        constraint candidato_pk
            primary key,
    cod_municipio_residencia integer,
    municipio_residencia varchar(150),
    uf_residencia char(2),
    uf_nascimento char(2),
    situacao_conclusao numeric(1),
    ano_concluiu smallint,
    idade smallint,
    sexo char,
    fk_codigo_escola integer
        constraint fk_candidato_codigo_escola
            references escola,
    uf_prova char(2)
)
;

create index if not exists idx_candidato_codigo_escola
    on candidato (fk_codigo_escola)
;

create table tb_uf
(
    uf varchar(2),
    pk_id serial not null
        constraint tb_uf_pkey
            primary key
)
;

create unique index tb_uf_uf_uindex
    on tb_uf (uf)
;

create unique index tb_uf_pk_id_uindex
    on tb_uf (pk_id)
;

And the queries (with plans):

EXPLAIN ANALYZE
SELECT pk_numero_inscricao, pk_codigo
FROM escola e
  JOIN candidato c
    ON c.fk_codigo_escola = e.pk_codigo
WHERE e.uf = 'RJ'
;

Time without EXPLAIN ANALYZE: 916ms
Plan: https://explain.depesz.com/s/M6B

EXPLAIN ANALYZE
SELECT pk_numero_inscricao, pk_codigo
FROM escola AS e
  JOIN candidato AS c
    ON c.fk_codigo_escola = e.pk_codigo
  JOIN tb_uf AS u
    ON e.uf_id = u.pk_id
WHERE u.uf = 'RJ'
;

Time without EXPLAIN ANALYZE: 72ms
Plan: https://explain.depesz.com/s/E3MR

EXPLAIN ANALYZE
SELECT pk_numero_inscricao, pk_codigo
FROM escola AS e
  JOIN candidato AS c
    ON c.fk_codigo_escola = e.pk_codigo
WHERE e.uf_id = 19
;

Time without EXPLAIN ANALYZE: 961ms
Plan: https://explain.depesz.com/s/v67V

The weird thing happening for me is that queries 1 and 3 are slower than query 2, although query 2 has an extra join. Does anybody know what might be causing this?

I noticed that Index Scan on table candidato is a lot slower on queries 1 and 3 also, and that makes no sense for me, since the final result is the same.

Another point is that EXPLAIN ANALYZE is adding a lot of overhead into the queries.

Thanks in advance! If I need to provide any more information, I can edit this post if needed!

Best Answer

The answer is pretty "searchable" when you notice, that 1st and 3rd query fetch 1.2M rows from candidato ...just to exclude 90% of them from the results.

2nd query returns 1 row from tb_uf, which forces Nested Loop plan.

This means that planner has wrong assumptions about statistics (expected results count) or costs (of random seek). You could either tune these values:

https://www.postgresql.org/message-id/20060926193553.GA27268@oppetid.no

Have Postgresql query planner use nested loop w/ indices over hash join

or manually force a Nested Loop. Although my intuition tells me it should be better to have Hash Join here.

Not having the data, I'd suggest to try one of:

  1. set enable_mergejoin = off
  2. using CTE to force order of operations, like (3rd query with minimal modifications for easier understanding):
 WITH e AS (SELECT * FROM escola WHERE uf_id = 19)
 SELECT pk_numero_inscricao, pk_codigo
 FROM e
   JOIN candidato AS c
     ON c.fk_codigo_escola = e.pk_codigo