Postgresql – Query taking +1s, how can I improve performance

postgresql

I have a database server running Postgresql 9.6. Logging querys that take +1s, I got a lot of these:

SELECT 
    "homepage_publicacao"."id", 
    "homepage_publicacao"."oferta_id", 
    "homepage_publicacao"."ordem", 
    "homepage_publicacao"."zerar_contadores",
    "homepage_publicacao"."inicio", "homepage_publicacao"."final" 
FROM "homepage_publicacao" 
    INNER JOIN "homepage_publicacao_cidades" ON 
       ("homepage_publicacao"."id" = "homepage_publicacao_cidades"."publicacao_id") 
    INNER JOIN "homepage_oferta" ON 
       ("homepage_publicacao"."oferta_id" = "homepage_oferta"."id") 
WHERE 
    ("homepage_publicacao_cidades"."cidade_id" = 31  AND 
    "homepage_publicacao"."inicio" <= '2017-06-19 14:34:33.432977'  AND 
    "homepage_oferta"."site_id" = 1  AND 
    "homepage_publicacao"."inicio" <= '2017-06-19 14:34:33.432977'  AND 
    "homepage_oferta"."aprovada" = true  AND 
    "homepage_oferta"."combo" <= 1  AND 
    "homepage_publicacao"."final" >= '2017-06-19 14:34:33.432977' ) 
ORDER BY 
    "homepage_publicacao"."ordem" ASC, 
    "homepage_publicacao"."inicio" DESC;

Queries are generated by Django's ORM (an ancient version) so I don't have full control over the SQL. I was wondering if I can do something to improve performance.

Explain returns this:

                                                                                                                      QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=5064.78..5065.62 rows=335 width=29) (actual time=527.259..527.294 rows=363 loops=1)
   Sort Key: homepage_publicacao.ordem, homepage_publicacao.inicio DESC
   Sort Method: quicksort  Memory: 53kB
   ->  Nested Loop  (cost=1507.04..5050.73 rows=335 width=29) (actual time=496.351..526.966 rows=363 loops=1)
         ->  Hash Join  (cost=1506.75..3158.42 rows=399 width=29) (actual time=496.286..497.268 rows=447 loops=1)
               Hash Cond: (homepage_publicacao_cidades.publicacao_id = homepage_publicacao.id)
               ->  Seq Scan on homepage_publicacao_cidades  (cost=0.00..1477.20 rows=45461 width=4) (actual time=0.016..221.486 rows=45398 loops=1)
                     Filter: (cidade_id = 31)
                     Rows Removed by Filter: 36937
               ->  Hash  (cost=1500.12..1500.12 rows=530 width=29) (actual time=200.071..200.071 rows=447 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 36kB
                     ->  Seq Scan on homepage_publicacao  (cost=0.00..1500.12 rows=530 width=29) (actual time=165.611..199.925 rows=447 loops=1)
                           Filter: ((inicio <= '2017-06-19 14:34:33.432977-03'::timestamp with time zone) AND (inicio <= '2017-06-19 14:34:33.432977-03'::timestamp with time zone) AND (final >= '2017-06-19 14:34:33.432977-03'::timestamp with time zone))
                           Rows Removed by Filter: 59903
         ->  Index Scan using homepage_oferta_pkey on homepage_oferta  (cost=0.29..4.73 rows=1 width=4) (actual time=0.005..0.005 rows=1 loops=447)
               Index Cond: (id = homepage_publicacao.oferta_id)
               Filter: (aprovada AND (combo <= 1) AND (site_id = 1))
               Rows Removed by Filter: 0
 Planning time: 0.864 ms
 Execution time: 527.473 ms
(20 rows)

[Update]

EXPLAIN (ANALYZE, BUFFERS):

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=3614.60..3615.44 rows=334 width=29) (actual time=765.149..765.168 rows=360 loops=1)
   Sort Key: homepage_publicacao.ordem, homepage_publicacao.inicio DESC
   Sort Method: quicksort  Memory: 53kB
   Buffers: shared hit=1878
   ->  Nested Loop  (cost=61.34..3600.60 rows=334 width=29) (actual time=553.062..764.856 rows=360 loops=1)
         Buffers: shared hit=1878
         ->  Hash Join  (cost=61.05..1712.71 rows=398 width=29) (actual time=553.037..553.901 rows=444 loops=1)
               Hash Cond: (homepage_publicacao_cidades.publicacao_id = homepage_publicacao.id)
               Buffers: shared hit=543
               ->  Seq Scan on homepage_publicacao_cidades  (cost=0.00..1477.20 rows=45461 width=4) (actual time=0.012..476.708 rows=45404 loops=1)
                     Filter: (cidade_id = 31)
                     Rows Removed by Filter: 36937
                     Buffers: shared hit=446
               ->  Hash  (cost=54.44..54.44 rows=529 width=29) (actual time=0.385..0.385 rows=444 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 36kB
                     Buffers: shared hit=97
                     ->  Index Scan using homepage_publicacao_final_idx on homepage_publicacao  (cost=0.29..54.44 rows=529 width=29) (actual time=0.025..0.308 rows=444 loops=1)
                           Index Cond: (final >= '2017-06-19 14:34:33.432977-03'::timestamp with time zone)
                           Filter: ((inicio <= '2017-06-19 14:34:33.432977-03'::timestamp with time zone) AND (inicio <= '2017-06-19 14:34:33.432977-03'::timestamp with time zone))
                           Rows Removed by Filter: 48
                           Buffers: shared hit=97
         ->  Index Scan using homepage_oferta_pkey on homepage_oferta  (cost=0.29..4.73 rows=1 width=4) (actual time=0.475..0.475 rows=1 loops=444)
               Index Cond: (id = homepage_publicacao.oferta_id)
               Filter: (aprovada AND (combo <= 1) AND (site_id = 1))
               Rows Removed by Filter: 0
               Buffers: shared hit=1335
 Planning time: 0.770 ms
 Execution time: 765.331 ms
(28 rows)

The top utility shows the following:

top - 17:00:16 up 10 days,  5:34,  6 users,  load average: 62,70, 60,89, 53,93
Tasks: 223 total,  71 running, 152 sleeping,   0 stopped,   0 zombie
%Cpu(s): 90,8 us,  5,3 sy,  0,0 ni,  0,0 id,  0,0 wa,  0,0 hi,  3,7 si,  0,2 st
KiB Mem : 32944564 total,  7517412 free,   387712 used, 25039440 buff/cache
KiB Swap:   262140 total,   262140 free,        0 used. 30954376 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
20523 postgres  20   0 4503004  47388  43804 R   3,3  0,1   0:01.54 postgres
20926 postgres  20   0 4503004  47944  44376 R   3,3  0,1   0:00.84 postgres
18350 postgres  20   0 4503028  47960  44372 R   3,0  0,1   0:05.44 postgres
18688 postgres  20   0 4503028  48152  44568 R   3,0  0,1   0:04.80 postgres
18813 postgres  20   0 4503140  47552  43860 R   3,0  0,1   0:04.58 postgres
18871 postgres  20   0 4503140  47620  43920 R   3,0  0,1   0:04.48 postgres
18917 postgres  20   0 4503140  47548  43860 R   3,0  0,1   0:04.41 postgres
19100 postgres  20   0 4503008  47432  43856 R   3,0  0,1   0:04.10 postgres
19334 postgres  20   0 4503300  48312  44564 R   3,0  0,1   0:03.66 postgres
19510 postgres  20   0 4503080  48796  45140 R   3,0  0,1   0:03.37 postgres
19657 postgres  20   0 4503300  45900  42256 R   3,0  0,1   0:03.08 postgres
19669 postgres  20   0 4503004  48464  44888 R   3,0  0,1   0:03.02 postgres
19753 postgres  20   0 4503140  67364  63696 R   3,0  0,2   0:02.94 postgres
20000 postgres  20   0 4503276  47848  44116 R   3,0  0,1   0:02.47 postgres
20005 postgres  20   0 4503004  47696  44116 R   3,0  0,1   0:02.47 postgres
20062 postgres  20   0 4503004  47572  43992 R   3,0  0,1   0:02.38 postgres
20198 postgres  20   0 4503004  47308  43728 R   3,0  0,1   0:02.13 postgres
20199 postgres  20   0 4503004  47500  43920 R   3,0  0,1   0:02.15 postgres
20335 postgres  20   0 4503004  47500  43920 R   3,0  0,1   0:01.87 postgres
20454 postgres  20   0 4503072  61988  58400 R   3,0  0,2   0:01.66 postgres
20495 postgres  20   0 4503004  48392  44820 R   3,0  0,1   0:01.57 postgres

Table definition:

uvarosa=# \d homepage_publicacao
                                       Table "public.homepage_publicacao"
      Column      |           Type           |                            Modifiers
------------------+--------------------------+------------------------------------------------------------------
 id               | integer                  | not null default nextval('homepage_publicacao_id_seq'::regclass)
 oferta_id        | integer                  | not null
 ordem            | integer                  | not null
 zerar_contadores | boolean                  | not null
 inicio           | timestamp with time zone | not null
 final            | timestamp with time zone | not null
Indexes:
    "homepage_publicacao_pkey" PRIMARY KEY, btree (id)
    "homepage_publicacao_final_idx" btree (final)
    "homepage_publicacao_inicio_final" btree (oferta_id, inicio, final)
    "homepage_publicacao_oferta_id" btree (oferta_id)
Foreign-key constraints:
    "homepage_publicacao_oferta_id_fkey" FOREIGN KEY (oferta_id) REFERENCES homepage_oferta(id) DEFERRABLE INITIALLY DEFERRED
Referenced by:
    TABLE "homepage_comune" CONSTRAINT "homepage_comune_publicacao_id_fkey" FOREIGN KEY (publicacao_id) REFERENCES homepage_publicacao(id) DEFERRABLE INITIALLY DEFERRED
    TABLE "homepage_publicacaocombo" CONSTRAINT "homepage_publicacaocombo_publicacao_id_fkey" FOREIGN KEY (publicacao_id) REFERENCES homepage_publicacao(id) DEFERRABLE INITIALLY DEFERRED
    TABLE "homepage_publicacao_cidades" CONSTRAINT "publicacao_id_refs_id_45ac5d7" FOREIGN KEY (publicacao_id) REFERENCES homepage_publicacao(id) DEFERRABLE INITIALLY DEFERRED


 uvarosa=# \d homepage_publicacao_cidades
                             Table "public.homepage_publicacao_cidades"
    Column     |  Type   |                                Modifiers
---------------+---------+--------------------------------------------------------------------------
 id            | integer | not null default nextval('homepage_publicacao_cidades_id_seq'::regclass)
 publicacao_id | integer | not null
 cidade_id     | integer | not null
Indexes:
    "homepage_publicacao_cidades_pkey" PRIMARY KEY, btree (id)
    "homepage_publicacao_cidades_publicacao_id_cidade_id_key" UNIQUE CONSTRAINT, btree (publicacao_id, cidade_id)
    "homepage_publicacao_cidades_cidade_id" btree (cidade_id)
    "homepage_publicacao_cidades_cidade_id_publicacao_id_idx" btree (cidade_id, publicacao_id)
    "homepage_publicacao_cidades_publicacao_id" btree (publicacao_id)
Foreign-key constraints:
    "homepage_publicacao_cidades_cidade_id_fkey" FOREIGN KEY (cidade_id) REFERENCES homepage_cidade(id) DEFERRABLE INITIALLY DEFERRED
    "publicacao_id_refs_id_45ac5d7" FOREIGN KEY (publicacao_id) REFERENCES homepage_publicacao(id) DEFERRABLE INITIALLY DEFERRED

The table homepage_publicacao_cidades is used for a many-to-many relationship between tables homepage_publicacao and homepage_cidade by the Django ORM. When the site started a few years ago they had many cities (cidades), but today there are only 3 operating and most of the deals are for city 31 (Sao Paulo) – so the cardinality is not very good. That said, the number of records is not that big specially for a machine with 32 GB of RAM, so I must be doing something wrong.

Best Answer

The most obvious thing to try would be:

create index on homepage_publicacao (final);

Which and have now done, and it did help but now something else got worse for unknown reasons.

And perhaps:

create index on homepage_publicacao_cidades (cidade_id,publicacao_id);

For the index only scan (although since half the table has cidade_id = 31, this might not be so effective.)

So on second thought, this one should be better:

create index on homepage_publicacao_cidades (publicacao_id,cidade_id);

They should be equivalent for the query you show, but the second ordering should be better for a wider variety of other queries. I think you had such an index in an earlier version of your question (it showed up in the explain plan you accidentally posted for a different query), but it doesn't seem to be there now.

If those aren't enough, knowing how many rows satisfy each of your individual conditions from the where clause would help.

Do you have non-default values for seq_page_cost and random_page_cost?

Finally, the query on homepage_oferta_pkey takes 100 times longer in one of your plans than it does in the other one, but there is no evident reason for that to be so. This suggests that your server may be so overloaded that its performance is unstable and so will be hard to use for tuning by trial and error. Do you have a more quiet test/QA server can use for tuning?