PostgreSQL – Why Hash Index May Not Be Faster Than Btree for Equality Lookups

Tags: btree, hashing, index, postgresql

For every version of Postgres that supported hash indexing, there is a warning or note that hash indexes are "similar or slower" or "not better" than btree indexes, at least up to version 8.3. From the docs:

Version 7.2:

Note: Because of the limited utility of hash indexes, a B-tree index should generally be preferred over a hash index. We do not have sufficient evidence that hash indexes are actually faster than B-trees even for = comparisons. Moreover, hash indexes require coarser locks; see Section 9.7.

Version 7.3 (and up to 8.2):

Note: Testing has shown PostgreSQL's hash indexes to be similar or slower than B-tree indexes, and the index size and build time for hash indexes is much worse. Hash indexes also suffer poor performance under high concurrency. For these reasons, hash index use is discouraged.

Version 8.3:

Note: Testing has shown PostgreSQL's hash indexes to perform no better than B-tree indexes, and the index size and build time for hash indexes is much worse. Furthermore, hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX after a database crash. For these reasons, hash index use is presently discouraged.

In this version 8.0 thread, they claim that they had never found a case where hash indexes were actually faster than btree.

Even in version 9.2, the performance gain for anything other than writing the actual index was almost nothing according to this blog post (14 March 2016):
Hash Indexes on Postgres by André Barbosa.

My question is how is that possible?

By definition, a hash index lookup is an O(1) operation, whereas a btree lookup is an O(log n) operation. So how is it possible for an O(1) lookup to be slower than (or even similar to) finding the correct branch, and then finding the correct record?

I want to know what about indexing theory could EVER make that a possibility!

Best Answer

Disk-based B-tree indexes truly are O(log N), but that is pretty much irrelevant for disk arrays that fit in this solar system. Because the upper levels of the tree tend to stay cached, a lookup is mostly O(1) with a very large constant (the page that has to come from disk) plus O((log N)-1) with a small constant (the cached levels). Formally, that is the same thing as O(log N), because constants don't matter in big O notation. But they do matter in reality.
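To get a feel for how small log N stays in practice, here is a rough back-of-the-envelope query. The fanout of 300 keys per index page is an assumption chosen for illustration (the real figure depends on key width, page size and fill factor), so read the result as an order of magnitude, not a measurement.

```sql
-- Rough estimate of B-tree depth: levels ~ ceil(log_fanout(N)).
-- fanout = 300.0 is an assumed, illustrative number of keys per index page.
SELECT n                         AS row_count
     , ceil(ln(n) / ln(300.0))   AS approx_levels
FROM  (VALUES (1e6), (1e9), (1e12)) AS t(n);
```

Under that assumption a million rows need about 3 levels and a trillion rows about 5, which is why the log N term barely moves while the constants dominate.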

Much of the slowdown in hash index lookups came from the need to protect against corruption or deadlocks caused by hash-table resizing concurrent with the lookups. Until recent versions (every version you mention is comically out of date), this need led to even higher constants and to rather poor concurrency. Vastly more man-hours went into optimizing B-tree concurrency than hash concurrency.
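Since the outcome depends so heavily on the version, it is easy enough to measure on your own installation. The following is only a sketch: the table and index names are hypothetical, the row count is arbitrary, and it should be run on a test system because DROP INDEX takes a heavy lock even inside a transaction that is rolled back.

```sql
-- Hypothetical demo table; all names here are illustrative only.
CREATE TABLE hash_vs_btree_demo AS
SELECT g AS id, md5(g::text) AS val
FROM   generate_series(1, 1000000) g;

CREATE INDEX demo_val_btree_idx ON hash_vs_btree_demo USING btree (val);
CREATE INDEX demo_val_hash_idx  ON hash_vs_btree_demo USING hash  (val);
ANALYZE hash_vs_btree_demo;

-- Equality lookup with only the hash index available ...
BEGIN;
DROP INDEX demo_val_btree_idx;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM hash_vs_btree_demo WHERE val = md5('42');
ROLLBACK;

-- ... and with only the B-tree index available.
BEGIN;
DROP INDEX demo_val_hash_idx;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM hash_vs_btree_demo WHERE val = md5('42');
ROLLBACK;
```

Repeat each lookup a few times so that caching effects settle, and compare the reported execution time and buffer counts rather than a single run.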

Related Solutions

PostgreSQL – Create index on very large table with many shared values

For starters, gid should probably be a numeric type. integer should be good enough, or bigint if the integer key space is not big enough. That means a much smaller footprint, faster processing than with character data, and faster and smaller indexes.
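If you want to convert the column in place, a minimal sketch could look like this. It assumes gid currently lives in a character column that holds nothing but digits; any other content makes the cast fail.

```sql
-- Assumes gid stores digits only in a character column.
-- This rewrites the whole table and holds an exclusive lock while it runs.
ALTER TABLE big_tbl
    ALTER COLUMN gid TYPE integer USING gid::integer;
```

If you rebuild the table anyway, as described further down, it is cheaper to apply the cast in that step than to rewrite the table twice.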

More importantly, to improve performance I suggest database normalization.

Quote:

There is a fairly regular pattern where each word appears about 1000 times.

Create a separate table for unique words:

CREATE TABLE word (
   word_id serial
 , word    text
);

Fill it with unique instances of word in your big_tbl:

INSERT INTO word (word)
SELECT DISTINCT word
FROM   big_tbl
ORDER  BY word;

ORDER BY is optional and not needed for the query at hand, but it speeds up index creation and might be cheaper overall.

The table should be small in comparison: only ~ 50k rows for 50M rows in your big table.
Add indexes after filling the table:

ALTER TABLE word
    ADD CONSTRAINT word_word_uni UNIQUE (word) -- essential
  , ADD CONSTRAINT word_word_id_pkey PRIMARY KEY (word_id);  -- expendable?

If those are read-only tables, you can do without the pk. It's not relevant to the operations at hand.

Replace your big table with a much smaller new table. You may have to lock the big table to avoid concurrent writes. Concurrent reads are not a problem.

CREATE TABLE big_tbl_new AS
SELECT b.gid      -- or the suggested smaller, faster numeric replacement
     , w.word_id, b.stat
FROM   big_tbl b
JOIN   word w USING (word)
ORDER  BY word;   -- sorting by word helps query at hand

ORDER BY clusters the data (once), making the query at hand faster because far fewer blocks have to be read (unless your data is mostly clustered already). The sort carries a cost, so weigh cost and benefit once more.

DROP TABLE big_tbl;     -- make sure your new table has all data!
ALTER TABLE big_tbl_new RENAME TO big_tbl;

Recreate indexes:

ALTER TABLE big_tbl ADD CONSTRAINT big_tbl_gid_pkey PRIMARY KEY (gid);  -- expendable?
CREATE INDEX big_tbl_word_id_idx ON big_tbl (word_id);  -- essential

Your query looks like this now and should be faster:

SELECT b.*
FROM   word w
JOIN   big_tbl b USING (word_id)
WHERE  w.word = 'something';

Reorganization is meant to be a one-time operation to re-organize your data. Keep the new form and also consider keeping indexes permanently.

All of this together (including new indexes) should occupy about half of what you had before on disk, also cutting the time for creation in half (at least). Index creation should be considerably faster, the query as well. If RAM is a limiting factor, these modifications pay double.

If you have to write to the table as well, it becomes more expensive (but you did not mention anything about that). You'd need to adjust your logic for DELETE / UPDATE / INSERT:
Example for INSERT: Fetch word_id for existing words or insert a new row in word returning the new word_id. Details for this:
How do I insert a row which contains a foreign key?
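The INSERT case can be sketched in a single statement. This is only a sketch under stated assumptions: it needs PostgreSQL 9.5 or later (for ON CONFLICT), relies on the UNIQUE constraint on word.word created above, and uses placeholder values for gid and stat. Under concurrent writes it can still come back empty when another transaction inserts the same word at the same moment, so treat it as a starting point.

```sql
-- Fetch the word_id for 'something', inserting the word first if it is missing,
-- then insert the new row into big_tbl with that word_id.
WITH ins AS (
   INSERT INTO word (word)
   VALUES ('something')
   ON CONFLICT (word) DO NOTHING   -- requires the UNIQUE constraint on word
   RETURNING word_id
   )
INSERT INTO big_tbl (gid, word_id, stat)
SELECT 12345, w.word_id, 1         -- gid and stat: placeholder values
FROM  (
   SELECT word_id FROM ins                             -- word was new
   UNION ALL
   SELECT word_id FROM word WHERE word = 'something'   -- word already existed
   LIMIT  1
   ) w;
```

DELETE and UPDATE need similar care: an UPDATE that changes the word has to resolve the new word_id the same way, and rows in word that are no longer referenced have to be cleaned up separately if you care about them.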

MySQL – Optimal indexing for a lookup table. HASH index, BTREE index or composite PK

a: Satisfy the WHERE with INDEX(something, time)
MyLookup -- If the pair (FKToTableA, FKToTableB) is unique, then make that the PRIMARY KEY, and put the columns in that order so that your SELECT can quickly get into MyLookup.
Don't use BIGINT (8 bytes) unless you expect to exceed 4 billion, the limit for INT UNSIGNED, which takes only 4 bytes.
IP addresses -- for the old IPv4, there are convenient routines for converting to INT UNSIGNED. An address in the new IPv6 format won't fit into BIGINT. See 5.6.3.
Do those Metadata columns need utf8? Can they be combined into a TEXT column? And other questions.
Build and maintain a "Summary table" rather than scanning large chunks of the "Fact" table to get this "report".
Do the SUMs before JOINing to b.
Do you really need a many-to-many mapping table? Seems like this is 1:many.
More tips on many-to-many.
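Two of the tips above, sketched in MySQL syntax. The column names are taken from the text; the INT UNSIGNED types and the InnoDB engine are assumptions, and the IP addresses are documentation examples.

```sql
-- Composite PRIMARY KEY on the pair, in the order the SELECT filters on.
-- INT UNSIGNED (4 bytes) is assumed to be wide enough for both keys.
CREATE TABLE MyLookup (
    FKToTableA INT UNSIGNED NOT NULL,
    FKToTableB INT UNSIGNED NOT NULL,
    PRIMARY KEY (FKToTableA, FKToTableB)
) ENGINE=InnoDB;

-- IPv4 fits into INT UNSIGNED; round trip through the built-in converters.
SELECT INET_NTOA(INET_ATON('192.0.2.1'));

-- IPv6 needs 16 bytes, so store it as VARBINARY(16) rather than BIGINT.
-- INET6_ATON() is available from MySQL 5.6.3 on.
SELECT HEX(INET6_ATON('2001:db8::1'));
```

With the pair as the primary key, InnoDB clusters the rows by (FKToTableA, FKToTableB), so a lookup by FKToTableA needs no separate secondary index.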
