PostgreSQL Performance – Improve Query Performance with Many Joins

indexindex-tuningperformancepostgresqlpostgresql-performance

From the following explain plan:

explain SELECT imei, cliente_avl_equipo.id_cliente AS idcliente,
               cliente_avl_vehiculo.grupo AS grupo, avl_vehiculo.id,
               avl_vehiculo.id_clase, avl_vehiculo_clase.icono AS icono,
               avl_vehiculo.descripcion, avl_vehiculo.patente,
               avl_vehiculo.info, avl_vehiculo.conductor,
               avl_vehiculo.fijo_chasis, avl_vehiculo.enabled
        FROM   cliente_avl_vehiculo,avl_vehiculo, avl_vehiculo_equipo,
               cliente_avl_equipo, avl_vehiculo_clase
        WHERE  avl_vehiculo.id                 =avl_vehiculo_equipo.id
        AND    cliente_avl_vehiculo.vehiculo_id=avl_vehiculo.id
        AND    cliente_avl_equipo.imei_equipo  =avl_vehiculo_equipo.imei
        AND    avl_vehiculo_clase.id           =avl_vehiculo.id_clase
        AND    avl_vehiculo_equipo.imei        =1234

I get an output like this:

Nested Loop  (cost=0.00..22.57 rows=1 width=125
    ->  Nested Loop  (cost=0.00..22.29 rows=1 width=67)
        ->  Nested Loop  (cost=0.00..14.92 rows=1 width=63)
          ->  Nested Loop  (cost=0.00..14.58 rows=1 width=63)
                ->  Index Scan using fki_avl_vehiculo_equipo_imei on avl_vehiculo_equipo  (cost=0.00..7.28 rows=1 width=12)
                      Index Cond: (imei = 1234)
                ->  Index Scan using index_avl_vehiculo_id on avl_vehiculo  (cost=0.00..7.29 rows=1 width=51)
                      Index Cond: (id = avl_vehiculo_equipo.id)
          ->  Index Scan using index_cliente_avl_vehiculo_id on cliente_avl_vehiculo  (cost=0.00..0.33 rows=1 width=8)
               Index Cond: (vehiculo_id = avl_vehiculo.id)
       ->  Index Scan using fki_cliente_avl_equipo_imei on cliente_avl_equipo  (cost=0.00..7.36 rows=1 width=12)
              Index Cond: (imei_equipo = 1234)
    ->  Index Scan using index_avl_vehiculo_clase_id on avl_vehiculo_clase  (cost=0.00..0.27 rows=1 width=62)
        Index Cond: (id = avl_vehiculo.id_clase)

I'm not too familiar with explain outputs, but the Index Scan lines seem OK to me, but I'm in doubt with the Nested Loop ones, any advise would be appreciated.

Added output of explain(analyze, verbose, buffers) as requested by @a_horse_with_no_name:

Nested Loop  (cost=0.00..22.57 rows=1 width=125) (actual time=0.122..0.122 rows=0 loops=1)
  Output: avl_vehiculo_equipo.imei, cliente_avl_equipo.id_cliente, cliente_avl_vehiculo.grupo, avl_vehiculo.id, avl_vehiculo.id_clase, avl_vehiculo_clase.icono, avl_vehiculo.descripcion, avl_vehiculo.patente, avl_vehiculo.info, avl_vehiculo.conductor, avl_ (...)
  Buffers: shared hit=6
  ->  Nested Loop  (cost=0.00..22.29 rows=1 width=67) (actual time=0.121..0.121 rows=0 loops=1)
        Output: cliente_avl_vehiculo.grupo, avl_vehiculo.id, avl_vehiculo.id_clase, avl_vehiculo.descripcion, avl_vehiculo.patente, avl_vehiculo.info, avl_vehiculo.conductor, avl_vehiculo.fijo_chasis, avl_vehiculo.enabled, avl_vehiculo_equipo.imei, cliente (...)
        Buffers: shared hit=6
        ->  Nested Loop  (cost=0.00..14.92 rows=1 width=63) (actual time=0.120..0.120 rows=0 loops=1)
              Output: cliente_avl_vehiculo.grupo, avl_vehiculo.id, avl_vehiculo.id_clase, avl_vehiculo.descripcion, avl_vehiculo.patente, avl_vehiculo.info, avl_vehiculo.conductor, avl_vehiculo.fijo_chasis, avl_vehiculo.enabled, avl_vehiculo_equipo.imei
              Buffers: shared hit=6
              ->  Nested Loop  (cost=0.00..14.58 rows=1 width=63) (actual time=0.119..0.119 rows=0 loops=1)
                    Output: avl_vehiculo.id, avl_vehiculo.id_clase, avl_vehiculo.descripcion, avl_vehiculo.patente, avl_vehiculo.info, avl_vehiculo.conductor, avl_vehiculo.fijo_chasis, avl_vehiculo.enabled, avl_vehiculo_equipo.imei, avl_vehiculo_equipo.id
                    Buffers: shared hit=6
                    ->  Index Scan using fki_avl_vehiculo_equipo_imei on public.avl_vehiculo_equipo  (cost=0.00..7.28 rows=1 width=12) (actual time=0.118..0.118 rows=0 loops=1)
                          Output: avl_vehiculo_equipo.imei, avl_vehiculo_equipo.id, avl_vehiculo_equipo.movil, avl_vehiculo_equipo.creation, avl_vehiculo_equipo.updated, avl_vehiculo_equipo.power_input, avl_vehiculo_equipo.ign_input, avl_vehiculo_equipo.po (...)
                          Index Cond: (avl_vehiculo_equipo.imei = 1234)
                          Buffers: shared hit=6
                    ->  Index Scan using index_avl_vehiculo_id on public.avl_vehiculo  (cost=0.00..7.29 rows=1 width=51) (never executed)
                          Output: avl_vehiculo.id, avl_vehiculo.id_clase, avl_vehiculo.descripcion, avl_vehiculo.patente, avl_vehiculo.info, avl_vehiculo.numero, avl_vehiculo.conductor, avl_vehiculo.combustible, avl_vehiculo.consumo, avl_vehiculo.marca, av (...)
                          Index Cond: (avl_vehiculo.id = avl_vehiculo_equipo.id)
              ->  Index Scan using index_cliente_avl_vehiculo_id on public.cliente_avl_vehiculo  (cost=0.00..0.33 rows=1 width=8) (never executed)
                    Output: cliente_avl_vehiculo.cliente_rut, cliente_avl_vehiculo.vehiculo_id, cliente_avl_vehiculo.id_cliente, cliente_avl_vehiculo.grupo
                    Index Cond: (cliente_avl_vehiculo.vehiculo_id = avl_vehiculo.id)
        ->  Index Scan using fki_cliente_avl_equipo_imei on public.cliente_avl_equipo  (cost=0.00..7.36 rows=1 width=12) (never executed)
              Output: cliente_avl_equipo.rut_cliente, cliente_avl_equipo.imei_equipo, cliente_avl_equipo.id_cliente
              Index Cond: (cliente_avl_equipo.imei_equipo = 1234)
  ->  Index Scan using index_avl_vehiculo_clase_id on public.avl_vehiculo_clase  (cost=0.00..0.27 rows=1 width=62) (never executed)
        Output: avl_vehiculo_clase.id, avl_vehiculo_clase.descripcion, avl_vehiculo_clase.categoria, avl_vehiculo_clase.icono
        Index Cond: (avl_vehiculo_clase.id = avl_vehiculo.id_clase)
Total runtime: 1.535 ms

Best Answer

Use table aliases, explicit JOIN syntax, trim noise and format the code to make it human-readable:

SELECT ave.imei
     , cae.id_cliente AS idcliente  -- odd alias?
     , cav.grupo
     , av.id
     , av.id_clase
     , avc.icono
     , av.descripcion
     , av.patente
     , av.info
     , av.conductor
     , av.fijo_chasis
     , av.enabled
FROM   avl_vehiculo_equipo  ave
JOIN   cliente_avl_equipo   cae ON cae.imei_equipo = ave.imei
JOIN   avl_vehiculo         av  ON av.id           = ave.id
JOIN   cliente_avl_vehiculo cav ON cav.vehiculo_id = av.id
JOIN   avl_vehiculo_clase   avc ON avc.id          = av.id_clase
WHERE  ave.imei = 1234;

It's now evident that, except for avl_vehiculo, you only need two columns from each table, probably just integer columns (would be ideal).

For top read performance optimized for this query, have this set of indexes:

ave(imei, id)                -- index only scan
cae(imei_equipo, id_cliente) -- index only scan
av (id, id_clase)            -- index scan
cav(vehiculo_id, grupo)      -- index only scan
avc(id, icono)               -- index only scan

Assuming Postgres 9.2 or later, and if some preconditions are met, you will now get index-only scans for all except avl_vehiculo.

I would CLUSTER (or pg_repack if you cannot afford exclusive locks) all involved tables based on these indexes at least once. More:

Sequence of index columns matters!