Postgresql – Speeding up index scan backwards query

indexindex-tuningpostgresqlpostgresql-performance

My app is making the following psql query, and it is running extremely slow:

SELECT COUNT(*) 
FROM (
  SELECT 1 AS one 
  FROM "large_table" 
  WHERE "large_table"."user_id" = 123 
  ORDER BY "large_table"."id" desc 
  LIMIT 1 OFFSET 30
) subquery_for_count;

When I change the ORDER BY to ASC, it runs like 100x quicker. I have the default primary key index on id, and I've experimented with adding an additional index for the id in desc order, however it didn't seem to make a difference.

When I run Explain Analyze, I see that it is using an index scan backwards on the slow query (desc). I tried manually disabling index scans for my session, and found that the query ran in 40seconds instead of 2 minutes, which is a noticeable improvement.

Any idea on what I can do to try and improve the speed of this query when sorting by DESC? I've read that for b-tree indices, it should generally give you the same performance irregardless of sort order, but that does not seem to be the case.

Best Answer

Your query must be using an index on "id" to scan the index in the implied order, and then filtering out everything where "user_id" does not equal 123, stopping after it finds 31 rows which survive the filter. Going in one direction it quickly finds 31 such rows, going in the other direction needs to filter out a large number of rows before 31 survive (because none/few of the rows starting at that end have user_id=123).

You could readily confirm this theory by doing an EXPLAIN (ANALYZE, BUFFERS) of the queries.

This is not fundamentally about the order of the index scan. If you picked a value for 123 which had the opposite property (they all occurred at the logical end of the index rather than the logical beginning) then the situation would be reversed. Specifying DESC would fix the problem, rather than causing it.

Any idea on what I can do to try and improve the speed of this query when sorting by DESC?

Your query seems pointless. Counting is not an order-dependent activity. This is probably not your real query. So who knows if our suggestions would transfer over to your real query? The most straighforward fix for this query would be to build a multicolumn index on (user_id, id). Then no rows would get filtered out one by one, as they would be removed in wholesale through the operation of the index.