PostgreSQL 9.6 – Speeding Up Slow SELECT Query

performancepostgresqlquery-performance

In a app called Links, users post photos of interesting content they've discovered recently (and others can vote or comment on them).

These posted photos are saved in a links_photo table in my postgresql 9.6.5 DB.

Each user's profile shows their posted photos – paginated by 10 objects each – ordered by upload time.

One SELECT query on the links_photo table is consistently showing up in slow_log. It's taking longer than 500ms, and is ~10X slower than what I'm experiencing in most other postgresql operations.

Here's an example of the corresponding SQL from my slow log:

LOG: duration: 542.755 ms statement:

SELECT "links_photo"."id",
       "links_photo"."owner_id",
       "links_photo"."image_file",
       "links_photo"."upload_time",
       "links_photo"."comment_count",
       "links_photo"."vote_score",
       "links_photo"."caption",
       "links_photo"."category",
       "auth_user"."id",
       "auth_user"."username",
       "auth_user"."date_joined",
       "links_userprofile"."id",
       "links_userprofile"."user_id",
       "links_userprofile"."score",
       "links_userprofile"."avatar"
FROM   "links_photo"
       INNER JOIN "auth_user"
               ON ( "links_photo"."owner_id" = "auth_user"."id" )
       LEFT OUTER JOIN "links_userprofile"
                    ON ( "auth_user"."id" = "links_userprofile"."user_id" )
WHERE  ( "links_photo"."owner_id" = 78689
         AND "links_photo"."category" = '1' )
ORDER  BY "links_photo"."upload_time" DESC
LIMIT  10 offset 10

See the the explain analyze results: https://explain.depesz.com/s/DPJo

According to that, the Index Scan Backward ends up filtering 1,196,188 rows and is the source of slowness.

What I think I should try:

Being an accidental DBA of sorts, I'm looking for some quick expert guidance on the subject. I would have thought having an index on upload_time would suffice, but it doesn't. So perhaps a composite one on (owner_id, upload_time DESC)?

Addition:

Here's the entire output for \d links_photo:

                                           Table "public.links_photo"
          Column          |           Type           |                        Modifiers                         
--------------------------+--------------------------+----------------------------------------------------------
 id                       | integer                  | not null default nextval('links_photo_id_seq'::regclass)
 owner_id                 | integer                  | not null
 image_file               | character varying(100)   | not null
 upload_time              | timestamp with time zone | not null
 comment_count            | integer                  | not null
 is_public                | boolean                  | not null
 vote_score               | integer                  | not null
 visible_score            | integer                  | not null
 invisible_score          | double precision         | not null
 caption                  | character varying(100)   | 
 avg_hash                 | character varying(16)    | not null
 latest_comment_id        | integer                  | 
 second_latest_comment_id | integer                  | 
 category                 | character varying(11)    | not null
 device                   | character varying(10)    | not null
Indexes:
    "links_photo_pkey" PRIMARY KEY, btree (id)
    "links_photo_latest_comment_id" btree (latest_comment_id)
    "links_photo_owner_id" btree (owner_id)
    "links_photo_second_latest_comment_id" btree (second_latest_comment_id)
    "links_photo_upload_time" btree (upload_time)
Foreign-key constraints:
    "latest_comment_id_refs_id_f2566197" FOREIGN KEY (latest_comment_id) REFERENCES links_photocomment(id) DEFERRABLE INITIALLY DEFERRED
    "links_photo_owner_id_fkey" FOREIGN KEY (owner_id) REFERENCES auth_user(id) DEFERRABLE INITIALLY DEFERRED
    "second_latest_comment_id_refs_id_f2566197" FOREIGN KEY (second_latest_comment_id) REFERENCES links_photocomment(id) DEFERRABLE INITIALLY DEFERRED
Referenced by:
    TABLE "links_photostream" CONSTRAINT "cover_id_refs_id_d62b783f" FOREIGN KEY (cover_id) REFERENCES links_photo(id) DEFERRABLE INITIALLY DEFERRED
    TABLE "links_photocomment" CONSTRAINT "links_photocomment_which_photo_id_fkey" FOREIGN KEY (which_photo_id) REFERENCES links_photo(id) DEFERRABLE INITIALLY DEFERRED
    TABLE "links_photoobjectsubscription" CONSTRAINT "links_photoobjectsubscription_which_photo_id_fkey" FOREIGN KEY (which_photo_id) REFERENCES links_photo(id) DEFERRABLE INITIALLY DEFERRED
    TABLE "links_photovote" CONSTRAINT "links_photovote_photo_id_fkey" FOREIGN KEY (photo_id) REFERENCES links_photo(id) DEFERRABLE INITIALLY DEFERRED
    TABLE "links_report" CONSTRAINT "links_report_which_photo_id_fkey" FOREIGN KEY (which_photo_id) REFERENCES links_photo(id) DEFERRABLE INITIALLY DEFERRED
    TABLE "links_photo_which_stream" CONSTRAINT "photo_id_refs_id_916b4355" FOREIGN KEY (photo_id) REFERENCES links_photo(id) DEFERRABLE INITIALLY DEFERRED

Best Answer

I have the same case like you. The problem is Rows Removed by Filter, the query takes time to remove rows by its filter. Depending on how many rows will be removed, the time will be equivalent.

Idea 1

Your approach is fine. The small thing is the query needs to remove rows on the category. However, it is not a big problem if your cardinality on category is not much.

Idea 2

I would suggest to create index on owner_id, category, upload_time DESC. It will avoid the filter. But, you should consider the size of index in case of large table.

Let say we have the table (db<>fiddle)

CREATE TABLE links_photo 
(
  id serial, 
  owner_id int, 
  category int, 
  upload_time timestamp without time zone
);

And the query

SELECT id
FROM links_photo 
WHERE owner_id = 100 and category = 2 
ORDER BY upload_time desc 
LIMIT 10 
OFFSET 0;

The explanation of using upload_time DESC

The explanation of using owner_id, upload_time DESC

The explanation of using owner_id, category, upload_time DESC