PostgreSQL Performance – How to Get Rid of Bitmap Heap Scan with Proper Indices

Tags: index, performance, postgresql, query-performance

I am running PostgreSQL 9.6. These are the relevant definitions:

CREATE TABLE IF NOT EXISTS instagram.profiles_1000 (
    id                          SERIAL PRIMARY KEY,
    username                    VARCHAR(255) NOT NULL UNIQUE,
    followers                   BIGINT,
    tsv                         TSVECTOR
    -- plus full_name, biography, location_city, location_region,
    -- and location_country, which are referenced by the trigger below
);

CREATE UNIQUE INDEX IF NOT EXISTS instagram_username_index
    ON instagram.profiles_1000(username);
CREATE INDEX IF NOT EXISTS instagram_followers_index
    ON instagram.profiles_1000(followers);
CREATE INDEX IF NOT EXISTS instagram_textsearch_index
    ON instagram.profiles_1000 USING GIN(tsv);

And the text vector is updated by a trigger:

CREATE FUNCTION instagram_documents_search_trigger() RETURNS trigger AS $$
begin
  -- tsvectors are concatenated with || directly;
  -- no textual separator is needed between the parts
  new.tsv :=
        setweight(to_tsvector(COALESCE(new.username, '')), 'D') ||
        setweight(to_tsvector(COALESCE(new.full_name, '')), 'C') ||
        setweight(to_tsvector(COALESCE(new.location_country, '')), 'B') ||
        setweight(to_tsvector(COALESCE(new.location_region, '')), 'B') ||
        setweight(to_tsvector(COALESCE(new.biography, '')), 'A') ||
        setweight(to_tsvector(COALESCE(new.location_city, '')), 'A');
  return new;
end
$$ LANGUAGE plpgsql;


CREATE TRIGGER instagram_tsvectorupdate BEFORE INSERT OR UPDATE
    ON instagram.profiles_1000 FOR EACH ROW
    EXECUTE PROCEDURE instagram_documents_search_trigger();
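Note that a BEFORE INSERT OR UPDATE trigger only populates tsv going forward; rows that already exist keep a NULL tsv until they are next updated. A sketch of a backfill (this rewrites every row, so it can be slow on a table this size):

```sql
-- Touch every row so the BEFORE UPDATE trigger recomputes tsv.
-- On a large table, consider batching by id range to keep
-- transactions and locks short.
UPDATE instagram.profiles_1000 SET username = username;
```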

This is the query:

select instagram.profiles_1000.*, categories, followers as rank                                                                                            
from instagram.profiles_1000
join plainto_tsquery('arts') as q on q @@ tsv
left outer join instagram.profile_categories_agg on instagram.profiles_1000.username = instagram.profile_categories_agg.username
where followers is not null and followers > 0
order by (followers, -id) desc
limit 50;

This is the output of EXPLAIN (ANALYZE, BUFFERS):

https://explain.depesz.com/s/ceCd

The culprit is the Bitmap Heap Scan, which accounts for the bulk of the total execution time. Frankly, I don't understand why it is needed, especially since the Bitmap Index Scan on instagram_textsearch_index already filters the rows by the search term.

Can someone shed some light?

EDIT: It was pointed out that I had misread the EXPLAIN output. Indeed, the left outer join was taking most of the time. I tried removing it as follows:

select instagram.profiles_1000.*, followers as rank
from instagram.profiles_1000
join plainto_tsquery('arts') as q on q @@ tsv                                              
where followers is not null and followers > 0
order by (followers, -id) desc
limit 50;

But the query still takes 13 seconds! This is the EXPLAIN (ANALYZE, BUFFERS) output:

https://explain.depesz.com/s/awfH

Now the bottleneck seems to be the full-text search. Is it really that slow? The table has just 5 million rows, and the tsv column (of type TSVECTOR) is indexed as follows:

CREATE INDEX IF NOT EXISTS instagram_textsearch_index_1000
    ON instagram.profiles_1000 USING GIN(tsv);

EDIT 2: I realized that I can write a leaner query if I only post-process the profiles that match the search (of which there are at most 50). Using this query:

select p.*, categories
from
    (select id
    from instagram.profiles_1000, plainto_tsquery('arts') as q
    where q @@ tsv and followers is not null and followers > 0
    order by (followers, -id) desc
    limit 50) as ids
inner join instagram.profiles_1000 as p on
    p.id = ids.id
left outer join instagram.profile_categories_agg as c on
    c.username = p.username;

I am able to obtain this result:
https://explain.depesz.com/s/OvG

This brings the search down to ~3 seconds. It would be nice to get it under 1 second, though.

Best Answer

If you want to improve the timing further, your best bet might be to abandon the FTS index, at least for cases where the @@ match criterion returns a lot of results.

First, change your ORDER BY from order by (followers, -id) desc to order by followers desc, id. This version is semantically equivalent (except perhaps in how it handles NULL values), but it does not have to package the two columns into a pseudo-row and then sort those row values; it sorts on the column values directly. This direct sorting is faster, but more importantly it opens up the possibility of using an index, rather than a sort, to satisfy the ORDER BY.
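Applied to the simplified query from the first edit, the rewrite looks like this (same result set, modulo NULL ordering):

```sql
select instagram.profiles_1000.*, followers as rank
from instagram.profiles_1000
join plainto_tsquery('arts') as q on q @@ tsv
where followers is not null and followers > 0
order by followers desc, id
limit 50;
```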

Then if you create an index on (followers desc, id), your query can step through that index looking for rows that satisfy the @@ condition, stopping once it finds 50 of them. Doing it this way could be much faster than pulling out over 100,000 rows that are @@ matches and sorting them to pull out the top 50.
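The suggested index would look like this (the index name is illustrative):

```sql
-- Matches the rewritten ORDER BY, so the planner can walk the
-- index in order and stop after the first 50 @@ matches.
CREATE INDEX IF NOT EXISTS instagram_followers_desc_id_index
    ON instagram.profiles_1000 (followers DESC, id);
```

Since the query also filters on followers > 0, a partial index (adding WHERE followers > 0) could shrink the index further, though that is an extra refinement the approach does not require.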