PostgreSQL Performance – Optimize Huge Table for Fast Read

partitioning, performance, postgresql

I'm about to create a Postgres table with 1.5 billion rows. The table will just have a single TEXT column.

The table is effectively a blacklist. When a user of my software saves certain data, this table is looked up to make sure the value they're saving doesn't exist in it.

What can I do to optimise Postgres or that table to make that "read" as fast as possible? The table will only ever be written to approximately once a year.

Best Answer

Add a second column that holds a hash of the text value and create the index on that hash column. Even if there is a hash collision, only a few rows will need to be read and compared against the full text value.
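
A minimal sketch of the idea, assuming a table named blacklist and using Postgres's built-in md5() function (the table, column, and index names are illustrative, not from the question):

    -- Store the original text plus a precomputed hash of it.
    -- md5() returns 32 hex characters, which cast cleanly to a 16-byte uuid.
    CREATE TABLE blacklist (
        value      text NOT NULL,
        value_hash uuid NOT NULL
    );

    -- Index only the fixed-width hash; this keeps the index much smaller
    -- than one built over arbitrarily long text values.
    CREATE INDEX blacklist_value_hash_idx ON blacklist (value_hash);

    -- Lookup: hit the index on the hash, then re-check the full text
    -- to rule out hash collisions.
    SELECT 1
    FROM   blacklist
    WHERE  value_hash = md5('candidate value')::uuid
    AND    value = 'candidate value';

The application (or a trigger) would populate value_hash the same way on the yearly load, so the index lookup and the collision check always agree on how the hash is computed.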