PostgreSQL – Postgres sometimes uses an inferior index for WHERE a IN (…) ORDER BY b LIMIT N

index-tuning, paging, performance, postgresql, postgresql-9.6, query-performance

We have a PostgreSQL table with ~5 billion rows that has developed a nasty habit of missing the proper indices and doing a Primary Key scan on certain LIMIT operations.

The problem generally manifests on an ORDER BY .. LIMIT .. clause (a common pattern in Django pagination) where the LIMIT selects a relatively small subset of the rows matched by the index. An extreme example is this:

SELECT * FROM mcqueen_base_imagemeta2 
  WHERE image_id IN ( 123, ... )
  ORDER BY id DESC
  LIMIT 1;

where the IN clause contains ~20 items and the total number of rows matched by the index on image_id is 16.

The EXPLAIN shows that it skips the image_id index and instead does a backward scan of the 5B-row primary key:

Limit  (cost=0.58..4632.03 rows=1 width=28)
   ->  Index Scan Backward using mcqueen_base_imagemeta2_pkey on mcqueen_base_imagemeta2  (cost=0.58..364597074.75 rows=78722 width=28)
         Filter: (image_id = ANY ('{123, ...}'::bigint[]))

If the LIMIT is increased to 2, it works as expected:

Limit  (cost=7585.92..7585.93 rows=2 width=28)
   ->  Sort  (cost=7585.92..7782.73 rows=78722 width=28)
         Sort Key: id DESC
         ->  Index Scan using mcqueen_base_imagemeta2_image_id_616fe89c on mcqueen_base_imagemeta2  (cost=0.58..6798.70 rows=78722 width=28)
               Index Cond: (image_id = ANY ('{123, ...}'::bigint[]))

This also happens on queries where the index matches ~3,000 rows and the limit is 100, i.e. something that easily happens in real-world REST API pagination.

The table definition is:

mcqueen=# \d mcqueen_base_imagemeta2
                                       Table "public.mcqueen_base_imagemeta2"
      Column       |           Type           |                              Modifiers                               
-------------------+--------------------------+----------------------------------------------------------------------
 id                | bigint                   | not null default nextval('mcqueen_base_imagemeta2_id_seq'::regclass)
 created_at        | timestamp with time zone | not null
 image_id          | bigint                   | not null
 key_id            | smallint                 | not null
 source_version_id | smallint                 | not null
Indexes:
    "mcqueen_base_imagemeta2_pkey" PRIMARY KEY, btree (id)
    "mcqueen_base_imagemeta2_image_id_616fe89c" btree (image_id)
    "mcqueen_base_imagemeta2_key_id_a4854581" btree (key_id)
    "mcqueen_base_imagemeta2_source_version_id_f9b0513e" btree (source_version_id)
Foreign-key constraints:
    "mcqueen_base_imageme_image_id_616fe89c_fk_mcqueen_b" FOREIGN KEY (image_id) REFERENCES mcqueen_base_image(id) DEFERRABLE INITIALLY DEFERRED
    "mcqueen_base_imageme_key_id_a4854581_fk_mcqueen_b" FOREIGN KEY (key_id) REFERENCES mcqueen_base_metakey(id) DEFERRABLE INITIALLY DEFERRED
    "mcqueen_base_imageme_source_version_id_f9b0513e_fk_mcqueen_b" FOREIGN KEY (source_version_id) REFERENCES mcqueen_base_metasourceversion(id) DEFERRABLE INITIALLY DEFERRED

I'm a novice at best when it comes to tuning, but I figure that the default statistics settings are not adequate for a table of that size, so the planner naively thinks that a PK scan is faster than an index scan.
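
For reference, here is how I understand you can check which statistics target currently applies to the column (attstattarget = -1 means it falls back to default_statistics_target):

-- Global sampling target used by ANALYZE when no per-column override exists.
SHOW default_statistics_target;

-- Per-column override, if any (-1 means "use default_statistics_target").
SELECT attname, attstattarget
  FROM pg_attribute
  WHERE attrelid = 'mcqueen_base_imagemeta2'::regclass
  AND attname = 'image_id';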

Best Answer

The planner thinks it is going to find 78,722 rows, but it really finds 16, so that misestimate is going to lead to some bad plans.

When a value in the IN list is not present in the column's MCV (most common values) list, the planner guesses its frequency using the n_distinct value, which is probably way off (you didn't answer my question about that). It does this by taking the number of tuples not covered by the MCV frequency list and dividing it by the number of distinct values not in the MCV list. So basically ntuples * (1 - sum of MCV frequencies) / (n_distinct - number of MCV entries). This simplified formula ignores NULLs.
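
To see the numbers that go into that formula, you can read the column's statistics straight out of pg_stats (a sketch, assuming you have permission to read the view):

-- n_distinct drives the guess for values outside the MCV list;
-- most_common_vals / most_common_freqs are the MCV list itself.
select n_distinct, most_common_vals, most_common_freqs
  from pg_stats
  where tablename = 'mcqueen_base_imagemeta2' and attname = 'image_id';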

As @ErwinBrandstetter suggests, you might be able to improve the situation by increasing the size of the MCV list, which means increasing the statistics sample size. That might also increase the accuracy of the n_distinct estimate. But with 6 billion rows, it might not be possible to increase the sample size by enough. Also, if rows with the same image_id are clumped together in the same pages, the sampling method PostgreSQL uses to compute n_distinct is quite biased, and that bias is resistant to fixing by just increasing the sample size.
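
If you want to try that route first, the knob is the per-column statistics target (a sketch; 10000 is the upper limit, and ANALYZE gets slower as the target grows):

-- raise the per-column sample target, then re-gather statistics
alter table mcqueen_base_imagemeta2 alter column image_id set statistics 10000;
analyze mcqueen_base_imagemeta2;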

A simpler way may be to set n_distinct manually:

alter table mcqueen_base_imagemeta2 alter column image_id set (n_distinct=1000000000);
analyze mcqueen_base_imagemeta2;

This method doesn't increase the time or storage required by ANALYZE, the way increasing the sample size does, and is also more likely to succeed.
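
Once the ANALYZE has run, one way to sanity-check the result (a sketch) is to confirm that pg_stats reports the forced n_distinct and that EXPLAIN now picks the image_id index for the original query:

-- pg_stats should now report the forced n_distinct for image_id
select n_distinct from pg_stats
  where tablename = 'mcqueen_base_imagemeta2' and attname = 'image_id';

-- re-run the problem query and check which index the plan uses
explain
select * from mcqueen_base_imagemeta2
  where image_id in (123)   -- stand-in; use the same ~20 ids as in the question
  order by id desc
  limit 1;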