PostgreSQL Performance – Speeding Up Creation of Partial Index

Tags: ddl | index | performance | performance-tuning | postgresql | postgresql-performance

I am trying to create partial indexes for a large (1.2TB), static table in Postgres 9.4.

My data is completely static, so I am able to insert all data, then create all indexes.

In this 1.2TB table, I have a column named run_id that cleanly divides the data. We've gotten great performance by creating indexes that cover a range of run_ids. Here's an example:

CREATE INDEX perception_run_frame_idx_run_266_thru_270
ON run.perception
(run_id, frame)
WHERE run_id >= 266 AND run_id <= 270;

These partial indexes give us the desired query speed. Unfortunately, the creation of each partial index takes about 70 minutes.

It looks like we are CPU limited (top is showing 100% for the process).
Is there anything I can do to speed up the creation of our partial indexes?

System specs:

  • 18 core Xeon
  • 192GB RAM
  • 12 SSDs in RAID
  • Autovacuums are turned OFF
  • maintenance_work_mem: 64GB (Too high?)

Table specs:

  • Size: 1.26 TB
  • Number of rows: 10.537 Billion
  • Typical index size: 3.2 GB (with roughly 0.5 GB of variance)

Table definition:

CREATE TABLE run.perception (
  id               bigint NOT NULL,
  run_id           bigint NOT NULL,
  frame            bigint NOT NULL,
  by               character varying(45) NOT NULL,
  by_anyone        bigint NOT NULL,
  by_me            bigint NOT NULL,
  by_s_id          integer,
  owning_p_id      bigint NOT NULL,
  obj_type_set     bigint,
  seq              integer,
  subj_id          bigint NOT NULL,
  subj_state_frame bigint NOT NULL,
  CONSTRAINT perception_pkey PRIMARY KEY (id)
);

(Don't read too much into the column names — I've obfuscated them somewhat.)

Background info:

  • We have a separate team onsite that consumes this data, but really there are only one or two users. (This data is all generated via a simulation.) Users only start analyzing the data once inserts are finished and indexes are completely built. Our main concern is reducing the time required to generate usable data, and right now the bottleneck is index creation time.
  • Query speed has been completely adequate when using partials. In fact, I think we could increase the number of runs that each index covers, and still maintain good-enough query performance.
  • My guess is that we will have to partition the table. We are trying to exhaust all other options before taking that route.

Best Answer

BRIN index

Available since Postgres 9.5 and probably just what you are looking for. Much faster index creation, much smaller index. But queries are typically not as fast. The manual:

BRIN stands for Block Range Index. BRIN is designed for handling very large tables in which certain columns have some natural correlation with their physical location within the table. A block range is a group of pages that are physically adjacent in the table; for each block range, some summary info is stored by the index.

Read on, there is more.
Depesz ran a preliminary test.

The optimum for your case: If you can write rows clustered on run_id, your index becomes very small and creation much cheaper.

CREATE INDEX foo ON run.perception USING brin (run_id, frame)
WHERE run_id >= 266 AND run_id <= 270;

You might even just index the whole table.
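For example, a single whole-table BRIN index could look like this (the index name is my own, and pages_per_range is shown only to point at the tuning knob; 128 is the default):

CREATE INDEX perception_run_frame_brin_idx ON run.perception
USING brin (run_id, frame)
WITH (pages_per_range = 128);  -- default; larger ranges = smaller index, coarser filtering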

Table layout

Whatever else you do, you can save 8 bytes per row, currently lost to padding due to alignment requirements, by ordering the columns like this:

CREATE TABLE run.perception(
  id               bigint NOT NULL PRIMARY KEY
, run_id           bigint NOT NULL
, frame            bigint NOT NULL
, by_anyone        bigint NOT NULL
, by_me            bigint NOT NULL
, owning_p_id      bigint NOT NULL
, subj_id          bigint NOT NULL
, subj_state_frame bigint NOT NULL
, obj_type_set     bigint
, by_s_id          integer
, seq              integer
, by               varchar(45) NOT NULL -- or just use type text
);

That makes your table roughly 79 GB smaller if none of the columns has NULL values (8 bytes saved per row × ~10.5 billion rows).

Also, you only have three columns that can be NULL. The NULL bitmap occupies 8 additional bytes per row for tables with 9 to 72 columns. If only a single integer column is NULL in a row, there is a corner case, a storage paradox: it would be cheaper to store a dummy value instead: 4 bytes wasted, but 8 bytes saved because the row then needs no NULL bitmap.
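A small throwaway demo of that corner case (table and column names are made up for illustration). pg_column_size() measures the in-memory row value, which differs slightly from the on-disk tuple, but it should show the same effect:

CREATE TEMP TABLE null_bitmap_demo (
  c1 bigint, c2 bigint, c3 bigint, c4 bigint,
  c5 bigint, c6 bigint, c7 bigint, c8 bigint,
  c9 integer  -- 9th column: a NULL bitmap no longer fits in the spare header byte
);

INSERT INTO null_bitmap_demo VALUES (1, 2, 3, 4, 5, 6, 7, 8, 9);     -- no NULLs
INSERT INTO null_bitmap_demo VALUES (1, 2, 3, 4, 5, 6, 7, 8, NULL);  -- one NULL integer

SELECT c9, pg_column_size(d) AS row_bytes  -- the row with the NULL should not come out smaller
FROM   null_bitmap_demo d;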

Partial indexes

Depending on your actual queries it might be more efficient to have these five partial indices instead of the one above:

CREATE INDEX perception_run_id266_idx ON run.perception(frame) WHERE run_id = 266;
CREATE INDEX perception_run_id267_idx ON run.perception(frame) WHERE run_id = 267;
CREATE INDEX perception_run_id268_idx ON run.perception(frame) WHERE run_id = 268;
CREATE INDEX perception_run_id269_idx ON run.perception(frame) WHERE run_id = 269;
CREATE INDEX perception_run_id270_idx ON run.perception(frame) WHERE run_id = 270;

Run one transaction for each.

Removing run_id as an index column this way saves 8 bytes per index entry: 32 instead of 40 bytes per row. Each index is also cheaper to create, but creating five of them instead of just one takes substantially longer for a table that's too big to stay in cache (as @Jürgen and @Chris commented). So that may or may not be useful for you.

Partitioning

Based on inheritance - the only option up to Postgres 9.5.
(The newer declarative partitioning, available since Postgres 10 and much improved in Postgres 11 and 12, is smarter.)

The manual:

All constraints on all children of the parent table are examined during constraint exclusion, so large numbers of partitions are likely to increase query planning time considerably. So the legacy inheritance based partitioning will work well with up to perhaps a hundred partitions; don't try to use many thousands of partitions.

Consequently, estimating around 1,000 distinct values for run_id, you would make partitions spanning roughly 10 values each.
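A minimal sketch of what that could look like (the child table names and the 10-run width are my own, following the estimate above). Since the data is bulk-loaded and static, you can load each child directly instead of routing inserts through a trigger on the parent:

CREATE TABLE run.perception_runs_260_269 (
    CHECK (run_id BETWEEN 260 AND 269)
) INHERITS (run.perception);

CREATE TABLE run.perception_runs_270_279 (
    CHECK (run_id BETWEEN 270 AND 279)
) INHERITS (run.perception);

-- COPY / INSERT each run's data straight into the matching child, then index it:
CREATE INDEX ON run.perception_runs_260_269 (run_id, frame);
CREATE INDEX ON run.perception_runs_270_279 (run_id, frame);

With constraint_exclusion = partition (the default), the planner skips children whose CHECK constraint contradicts the query's run_id predicate.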


maintenance_work_mem

On first reading I missed that you are already adjusting maintenance_work_mem. I'll leave the quote and advice in my answer for reference. Per the documentation:

maintenance_work_mem (integer)

Specifies the maximum amount of memory to be used by maintenance operations, such as VACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN KEY. It defaults to 64 megabytes (64MB). Since only one of these operations can be executed at a time by a database session, and an installation normally doesn't have many of them running concurrently, it's safe to set this value significantly larger than work_mem. Larger settings might improve performance for vacuuming and for restoring database dumps.

Note that when autovacuum runs, up to autovacuum_max_workers times this memory may be allocated, so be careful not to set the default value too high. It may be useful to control for this by separately setting autovacuum_work_mem.

I would only set it as high as needed, which depends on the (to us) unknown index size, and only locally for the executing session. As the quote explains, a general setting that is too high can otherwise starve the server, because autovacuum may claim more RAM, too. Also, don't set it much higher than needed even in the executing session; free RAM might be put to good use caching data.

It could look like this:

BEGIN;

SET LOCAL maintenance_work_mem = '10GB';  -- depends on resulting index size

CREATE INDEX perception_run_frame_idx_run_266_thru_270 ON run.perception(run_id, frame)
WHERE run_id >= 266 AND run_id <= 270;

COMMIT;

About SET LOCAL:

The effects of SET LOCAL last only till the end of the current transaction, whether committed or not.

To measure object sizes:
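For example, with the standard size functions (using the table and index names from the question):

SELECT pg_size_pretty(pg_relation_size('run.perception'))                                AS table_size
     , pg_size_pretty(pg_relation_size('run.perception_run_frame_idx_run_266_thru_270')) AS index_size
     , pg_size_pretty(pg_total_relation_size('run.perception'))                          AS table_plus_indexes;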

Apart from that, the server should of course be configured reasonably in general.