PostgreSQL Index Tuning – BitmapIndexScan vs IndexScan


I have the following test case:

CREATE SCHEMA IF NOT EXISTS long;
SET search_path TO long,pg_temp,pg_catalog;
CREATE TABLE long WITH (autovacuum_enabled=off) AS
SELECT id, (random()*1000)::int AS key, random() AS value1, random() AS value2,
       repeat((random()*9)::int::text, 312) AS long
  FROM generate_series(1,10000000) id;
CREATE INDEX i_long_value1 ON long (value1);
CREATE INDEX i_long_wo ON long(key,value2 DESC);
VACUUM ANALYZE long;

Now, if I explain the following query:

SELECT * FROM long WHERE key=5 AND value1 BETWEEN 0.5 AND 0.6 ORDER BY value2 DESC;

I always get the following plan.
This is strange, because I have a matching index, i_long_wo. So I use SET enable_bitmapscan TO off; and this results in a much better plan. This is reproducible on all PG instances I've tried (win64, HP-UX, MacOS), and not only on 9.4beta2. I also ran these queries several times before looking at the plans, to make sure all the data is cached; you can check the Buffers: shared hit entries in the presented plans.
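
Roughly, this is how the two plans are captured (a sketch; the ANALYZE and BUFFERS options are what produce the Buffers: shared hit lines mentioned above):

-- Default planner settings: produces the bitmap plan.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM long WHERE key=5 AND value1 BETWEEN 0.5 AND 0.6 ORDER BY value2 DESC;

-- Disable bitmap scans for this session only: produces the index scan plan.
SET enable_bitmapscan TO off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM long WHERE key=5 AND value1 BETWEEN 0.5 AND 0.6 ORDER BY value2 DESC;
RESET enable_bitmapscan;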

I am running on the following (the same effect is also observed on 9.1-9.4b2 on MacOS):

postgres=# SELECT version();
                            version
----------------------------------------------------------------
 PostgreSQL 9.4beta2, compiled by Visual C++ build 1800, 64-bit

postgres=# SELECT name,setting,unit,source FROM pg_settings
            WHERE NOT source IN ('default','override');
             name             |      setting       | unit |        source
------------------------------+--------------------+------+----------------------
 application_name             | psql               |      | client
 bgwriter_delay               | 50                 | ms   | configuration file
 bgwriter_lru_maxpages        | 350                |      | configuration file
 checkpoint_completion_target | 0.9                |      | configuration file
 checkpoint_segments          | 100                |      | configuration file
 client_encoding              | WIN1252            |      | client
 DateStyle                    | ISO, MDY           |      | configuration file
 default_text_search_config   | pg_catalog.english |      | configuration file
 dynamic_shared_memory_type   | windows            |      | configuration file
 effective_cache_size         | 524288             | 8kB  | configuration file
 lc_messages                  | C                  |      | configuration file
 lc_monetary                  | C                  |      | configuration file
 lc_numeric                   | C                  |      | configuration file
 lc_time                      | C                  |      | configuration file
 listen_addresses             | *                  |      | configuration file
 log_destination              | stderr             |      | configuration file
 log_line_prefix              | %t                 |      | configuration file
 log_timezone                 | Europe/Helsinki    |      | configuration file
 logging_collector            | on                 |      | configuration file
 max_connections              | 100                |      | configuration file
 max_stack_depth              | 2048               | kB   | environment variable
 port                         | 5432               |      | configuration file
 shared_buffers               | 131072             | 8kB  | configuration file
 TimeZone                     | Europe/Helsinki    |      | configuration file
 wal_buffers                  | 2048               | 8kB  | configuration file
 work_mem                     | 16384              | kB   | configuration file
(26 rows)

The Windows instance uses the 9.4beta2 installer from EDB; I will update the compile details as soon as I am around. The configuration used to build the MacOS instance:

CONFIGURE = '--prefix=/usr/local/Cellar/postgresql-9.4/9.4beta2' '--enable-dtrace' \
  '--with-bonjour' '--with-gssapi' '--with-ldap' '--with-libxml' \
  '--with-libxslt' '--with-openssl' '--with-uuid=e2fs' '--with-pam' \
  '--with-perl' '--with-python' '--with-tcl' \
'CC=/usr/bin/clang' \
'CFLAGS=-Os -w -pipe -march=native -mmacosx-version-min=10.9' \
'LDFLAGS=-L/usr/local/opt/readline/lib -L/usr/local/opt/e2fsprogs/lib \
  -L/usr/local/opt/gettext/lib -L/usr/local/lib \
  -Wl,-headerpad_max_install_names' \
'CPPFLAGS=-I/usr/local/opt/readline/include -I/usr/local/opt/e2fsprogs/include \
  -I/usr/local/opt/gettext/include'
LIBS = -lpgcommon -lpgport -lxslt -lxml2 -lpam -lssl -lcrypto -lgssapi_krb5 \
  -lz -lreadline -lm

So I am wondering: why does the BitmapIndexScan get better estimates?

I hit this post from Tom Lane (quite old), and as I understand it, the recommended way is to tune the *_cost parameters. The thing is, the cost is driven more by the number of pages; this can be seen from the number of buffers PostgreSQL touches during execution, which is comparable for both plans and higher than the number of records returned. Therefore, adjusting random_page_cost lowers the cost of both plans, and the IndexScan never wins.

I tried increasing cpu_index_tuple_cost; I had to bump it up to 0.02 to make the planner stop using the BitmapAnd. In this setup the BitmapIndexScan variant looks much better, but it still loses to the IndexScan.
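
Roughly, these experiments look like this (a sketch; 0.02 is the value mentioned above, against a default cpu_index_tuple_cost of 0.005 and a default random_page_cost of 4.0):

-- Lowering random_page_cost (default 4.0) reduces the estimated cost of both
-- plans roughly in step, so their relative order does not change.
SET random_page_cost = 1.0;
EXPLAIN SELECT * FROM long WHERE key=5 AND value1 BETWEEN 0.5 AND 0.6 ORDER BY value2 DESC;
RESET random_page_cost;

-- Raising cpu_index_tuple_cost (default 0.005) to 0.02 makes the planner drop
-- the BitmapAnd; the remaining bitmap plan looks much better, but still loses
-- to the IndexScan.
SET cpu_index_tuple_cost = 0.02;
EXPLAIN SELECT * FROM long WHERE key=5 AND value1 BETWEEN 0.5 AND 0.6 ORDER BY value2 DESC;
RESET cpu_index_tuple_cost;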

Is it possible to make the planner give a better estimate for the IndexScan here?

Best Answer

If I run your example with a cold cache for both queries, then the bitmap index scan really does win out. So in this sense, the planner is getting it correct.

Since you are testing under a perfectly hot cache, the correct thing to do is to lower both random_page_cost and seq_page_cost to zero. If I do that, it does select the index scan you prefer (and at that point, I can't even get it to use the BitmapAnd, no matter what I do to the enable_* parameters).
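
Something along these lines (per-session settings; zero page costs only make sense for a fully cached test like this one):

-- Model a perfectly hot cache: page fetches cost nothing, so only the
-- CPU cost components remain in the estimates.
SET seq_page_cost = 0;
SET random_page_cost = 0;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM long WHERE key=5 AND value1 BETWEEN 0.5 AND 0.6 ORDER BY value2 DESC;
RESET seq_page_cost;
RESET random_page_cost;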

This is not to say that the planner is doing a good job here. It is vastly underestimating the amount of work that needs to be done to insert a tuple into a bitmap. It assumes it takes 0.1 of a cpu_operator_cost to do so, which seems facially absurd to me.
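
To put a rough number on that (back-of-the-envelope arithmetic on my part, assuming the default cpu_operator_cost of 0.0025 and the roughly 1,000,000 of the 10M rows that value1 BETWEEN 0.5 AND 0.6 should match):

-- The planner's charge for building the bitmap over ~1,000,000 index tuples:
-- 0.1 * cpu_operator_cost per tuple, or about 250 cost units in total.
SELECT 0.1 * 0.0025 * 1000000 AS bitmap_build_cost_estimate;  -- 250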

If you measure your buffer hit/miss rates for this particular query under realistic conditions, rather than under benchmarking conditions, what do they look like? It would be nice to do the tests under realistic conditions, but those are very hard to achieve, so I just do all-hot or all-cold and hope it interpolates correctly.
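
If pg_stat_statements is available (an assumption on my part), its per-query counters answer that directly; otherwise the Buffers: lines from EXPLAIN (ANALYZE, BUFFERS) give the same information per execution:

-- Requires the pg_stat_statements extension to be installed and loaded.
SELECT query,
       calls,
       shared_blks_hit,
       shared_blks_read,
       round(100.0 * shared_blks_hit
             / nullif(shared_blks_hit + shared_blks_read, 0), 2) AS hit_pct
  FROM pg_stat_statements
 WHERE query LIKE '%value1 BETWEEN%'
 ORDER BY calls DESC;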