Postgresql – Index for a Postgres query with a sort and an equality

indexoptimizationperformancepostgresqlpostgresql-performance

This query is very slow:

EXPLAIN (ANALYZE, buffers) SELECT *
FROM
    "Follows" AS "Follow"
INNER JOIN "Users" AS "followee" ON "Follow"."followeeId" = "followee"."id"
WHERE
    "Follow"."followerId" = 169368
ORDER BY
    "Follow"."createdAt" DESC
LIMIT 1000;

Here is the explain:

Limit  (cost=0.86..2120.08 rows=141 width=814) (actual time=0.776..239.289 rows=262 loops=1)
  Buffers: shared hit=750 read=673 dirtied=1
  ->  Nested Loop  (cost=0.86..2120.08 rows=141 width=814) (actual time=0.774..239.148 rows=262 loops=1)
        Buffers: shared hit=750 read=673 dirtied=1
        ->  Index Scan using follows_followinglist_followerid_createdat_idx on "Follows" "Follow"  (cost=0.43..681.88 rows=170 width=41) (actual time=0.377..52.687 rows=262 loops=1)
              Index Cond: ("followerId" = 169368)
              Buffers: shared hit=149 read=115
        ->  Index Scan using "Users_pkey" on "Users" followee  (cost=0.43..8.45 rows=1 width=773) (actual time=0.559..0.709 rows=1 loops=262)
              Index Cond: (id = "Follow"."followeeId")
              Buffers: shared hit=601 read=558 dirtied=1
Total runtime: 239.545 ms

I have the following indexes (currently, it is using the first one):

CREATE INDEX follows_followinglist_followerid_createdat_idx ON "Follows"
  ("followerId", "createdAt" DESC)
CREATE INDEX follows_followinglist_followerid_createdat_idx2 ON "Follows"
  ("createdAt" DESC, "followerId")
CREATE INDEX follows_followerId_fk_index ON "Follows" ("followerId");
CREATE INDEX  "follows_createdat_index" ON "public"."Follows" USING btree("createdAt" DESC);

Before I added the first two, the cost was slightly lower (~2000) using follows_followerId_fk_index, but still too slow.

I'm wondering if there's a way to optimize this further.

Best Answer

Simplified query to make it readable:

SELECT *
FROM   "Follows" f
JOIN   "Users"   u ON f."followeeId" = u."id"
WHERE  f."followerId" = 169368
ORDER  BY f."createdAt" DESC
LIMIT  1000;

Your index follows_followinglist_followerid_createdat_idx looks good for the job. In Postgres 9.2+ it might get a bit faster if you append "followeeId" to the index - if and only if you can get an index-only scan out of this. Maybe not possible in your case.

CREATE INDEX follows_foo_idx ON "Follows"("followerId", "createdAt" DESC, "followeeId");

Generally, this query won't get lighning fast, since you have to fetch up to 1000 random rows from "Users". The key to performance will be:

  • Optimized table layout of "Users" to keep the table as small as possible, so fewer blocks have to be read and more of it can stay in cache.
  • Plenty of RAM and cache settings accordingly if your table "Users" is big.

It would also help to cluster the "Follows" table from time to time. Details: